Abstract

We show that the first-order theory of structural subtyping 
of non-recursive types is decidable. 

Let $\Sigma$ be a language consisting of function symbols (representing type constructors) and $C$ a decidable structure in the relational language $L$ containing a binary relation $\leq$. $C$ represents primitive types; $\leq$ represents a subtype ordering. We introduce the notion of $\Sigma$-term-power of $C$, which generalizes the structure arising in structural subtyping. The domain of the $\Sigma$-term-power of $C$ is the set of $\Sigma$-terms over the set of elements of $C$.

We show that the decidability of the first-order theory of $C$ implies the decidability of the first-order theory of the $\Sigma$-term-power of $C$. This result implies the decidability of the first-order theory of structural subtyping of non-recursive types.

Our decision procedure is based on quantifier elimination 
and makes use of quantifier elimination for term algebras 
and Feferman-Vaught construction for products of decidable 
structures. 

We also explore connections between the theory of structural subtyping of recursive types and monadic second-order theory of tree-like structures. In particular, we give an embedding of the monadic second-order theory of infinite binary tree into the first-order theory of structural subtyping of recursive types.

Keywords: Structural Subtyping, Quantifier Elimination, Term Algebra, Decision Problem, Monadic Second-Order Logic
1 Introduction

Subtyping constraints are an important technique for checking and inferring program properties, used both in type systems and program analyses.

This paper presents a decision procedure for the first-order theory of structural subtyping of non-recursive types. This result solves (for the case of non-recursive types) a problem left open in previous work. That work provides the decidability result for structural subtyping of only unary type constructors, whereas we solve the problem for any number of constructors of any arity. Furthermore, we do not impose any constraints on the subtyping relation $\leq$; it need not even be a partial order. The generality of our construction makes it potentially of independent interest in logic and model theory.

We approach the problem of structural subtyping using quantifier elimination and, to some extent, using monadic second-order logic of tree-like structures. This paper makes the following contributions:

• we give a new presentation of Feferman-Vaught theorem for direct products using a multisorted logic (Section 3.3); for completeness we also include proof of quantifier-elimination for boolean algebras of sets (Section 3.2);

• we give a new presentation of decidability of the first-order theory of term algebras; the proof uses the language of both constructor and selector symbols;

• as an introduction to main result, we show decidability of structural subtyping with one covariant binary constructor and two constants (Section 4); this result does not rely on Feferman-Vaught technique;

• we present a new construction, term-power algebra, for creating tree-like theories based on existing theories (Section 5);

• as a central result, we prove that if the base theory is decidable, so is the theory of term-power with arbitrary variance of constructors; we give an effective decision procedure for quantifier elimination in term-power structure; the procedure combines elements of quantifier elimination in Feferman-Vaught theorem and quantifier elimination in term algebras (Sections 5 and 6);

• we show the decidability of structural subtyping of non-recursive types as a direct consequence of the main result;

• we give a simple embedding of monadic second-order theory of infinite binary tree into the theory of structural subtyping of recursive types with two primitive types (Section 7.1);

• we show that structural subtyping of recursive types where terms range over constant shapes is decidable.

In addition to showing the decidability of structural subtyping, our hope is to promote the important technique of quantifier elimination, which forms the basis of our result.

Quantifier elimination [22, Section 2.7] is a fruitful technique that was used to show decidability and classification of boolean algebras [46, 51], decidability of term algebras [31, Chapter 23], [39], with membership constraints and with queues, decidability of products [35, 14], [31, Chapter 12], and algebraically closed fields [50].

The complexity of the decision problem for the first-order theory of structural subtyping has a non-elementary lower bound. This is a consequence of a general theorem about pairing functions [15, Theorem 1.2, Page 163] and applies to term algebras already, as observed in [39, 43].

2 Preliminaries 

In this section we review some notions used in this paper.

If $w$ is a word over some alphabet, we write $|w|$ for the length of $w$. We write $w_1 \cdot w_2$ to denote the concatenation of words $w_1$ and $w_2$.

A node $v$ in a directed graph is a sink if $v$ has no outgoing edges. A node $v$ in a directed graph is a source if $v$ has no incoming edges.

We write $E_1 \equiv E_2$ to denote equality of syntactic entities $E_1$ and $E_2$.

We write $\bar{x}$ to denote some sequence of variables $x_1, \ldots, x_n$.

We assume that formulas are built from propositional connectives $\land$, $\lor$, $\lnot$; the remaining connectives are defined as shorthands. Connective $\lnot$ binds the strongest, followed by $\land$ and $\lor$.

A literal $L$ is an atomic formula $A$ or a negation of an atomic formula $\lnot A$. We define complementation of a literal by $\overline{A} = \lnot A$ and $\overline{\lnot A} = A$.

A formula $\psi$ is in prenex form if it is of the form

$$Q_1 x_1 \ldots Q_n x_n.\ \phi$$

where $Q_i \in \{\forall, \exists\}$ for $1 \leq i \leq n$ and $\phi$ is a quantifier free formula. We call $\phi$ a matrix of $\psi$.

If $\phi$ is a formula then $FV(\phi)$ denotes the set of free variables in $\phi$.

We write $[x_1 \mapsto a_1, \ldots, x_k \mapsto a_k]$ for the substitution $\sigma$ such that $\sigma(x_i) = a_i$ for $1 \leq i \leq k$.

If $\phi$ is a formula and $t_1, \ldots, t_k$ terms, we write $\phi[x_1 := t_1, \ldots, x_k := t_k]$ for the result of simultaneously substituting free occurrences of variables $x_i$ with term $t_i$, for $1 \leq i \leq k$.

We write $h(t)$ for the height of term $t$. $h(a) = 0$ if $a$ is a constant, $h(x) = 0$ if $x$ is a variable. If $f(t_1, \ldots, t_k)$ is a term then

$$h(f(t_1, \ldots, t_k)) = 1 + \max(h(t_1), \ldots, h(t_k))$$

We assume that all function symbols are of finite arity. If there are finitely many function symbols then for any nonnegative integer $k$ there is only a finite number of terms $t$ such that $h(t) \leq k$.
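The height function and the finiteness claim can be illustrated with a small sketch. The signature below (two constants and one binary symbol) and the nested-tuple representation of ground terms are assumptions made only for this example; they do not come from the paper.

```python
from itertools import product

# Hypothetical signature for illustration: symbol name -> arity.
SIGNATURE = {"a": 0, "b": 0, "f": 2}

def height(term):
    """Height of a ground term given as a nested tuple (symbol, child_1, ..., child_k)."""
    children = term[1:]
    if not children:  # a constant has height 0
        return 0
    return 1 + max(height(c) for c in children)

def terms_up_to(k, signature=SIGNATURE):
    """Return the (finite) set of all ground terms t over `signature` with h(t) <= k."""
    terms = {(f,) for f, n in signature.items() if n == 0}
    for _ in range(k):
        new_terms = set(terms)
        for f, n in signature.items():
            if n > 0:
                # Apply f to every tuple of already constructed terms.
                for args in product(terms, repeat=n):
                    new_terms.add((f,) + args)
        terms = new_terms
    return terms
```

For this particular signature the sets grow as 2, 6, 38 for bounds 0, 1, 2, which shows both that each set is finite and how quickly it grows.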

If $\phi(u)$ is a conjunction of literals, we say that $\phi'$ results from $\exists u. \phi(u)$ by dropping quantified variable $u$ iff $\phi'$ is the result of eliminating from $\phi(u)$ all conjuncts containing $u$. More generally, if $\psi$ is a formula of form

$$Q_1 x_1 \ldots Q u \ldots Q_k x_k.\ \psi_0$$

then the result of dropping $u$ from $\psi$ is

$$Q_1 x_1 \ldots Q_k x_k.\ \psi_0'$$

where $\psi_0'$ is the result of dropping $u$ from $\exists u. \psi_0$.

An equality is an atomic formula $t_1 = t_2$ where $t_1$ and $t_2$ are terms. A disequality is a negation of an equality.



We use the usual Tarskian semantics of formulas. Unless otherwise stated, $\phi \models \psi$ will denote that formula $\phi \Rightarrow \psi$ is true in a fixed relational structure that is under current consideration.

Occasionally we find it convenient to work with multisorted logic, where the domain is a union of disjoint sets called sorts, and arity specifies the sorts of all operations. Constants are operations with zero arguments. Relations are operations that return the result in a distinguished sort bool interpreted over the boolean lattice {false, true} or over the distributive lattice of three-valued logic {false, true, undef} from Section 2.3.

A structure $C$ of a given language $L$ is a pair of domain $C$ and the interpretation function $\llbracket \_ \rrbracket^C$. Hence, we name operations of the structure using symbols of the language and the interpretation function. If $C$ is clear from the context we write simply $\llbracket \_ \rrbracket$ for $\llbracket \_ \rrbracket^C$.

In Section 3.3 and Section 6 we use logic with several kinds of quantifiers. Our logic is first-order, but we give higher-order types to quantifiers. For example, a quantifier

$$Q :: (A \to B) \to B$$

denotes a quantifier that binds variables of $A$ sort enclosed within an expression of $B$ sort and returns an expression of $B$ sort. If $X$ and $Y$ are sets then $X \to Y$ denotes the set of all functions from $X$ to $Y$. When specifying the semantics of the quantifier $Q$ we specify a function

$$\llbracket Q \rrbracket : (\llbracket A \rrbracket \to \llbracket B \rrbracket) \to \llbracket B \rrbracket$$

The semantics of an expression $M$ of sort $B$ takes an environment $\sigma$ which is a function from variable names to elements of $A$ and produces an element of $B$, hence $\llbracket M \rrbracket \sigma \in \llbracket B \rrbracket$. We define the semantics of an expression $Qx. M$ by:

$$\llbracket Qx. M \rrbracket \sigma = \llbracket Q \rrbracket h$$

where $h : \llbracket A \rrbracket \to \llbracket B \rrbracket$ is the function

$$h(a) = \llbracket M \rrbracket(\sigma[x := a])$$

Here

$$\sigma[x := a](y) = \begin{cases} \sigma(y), & y \not\equiv x \\ a, & y \equiv x \end{cases}$$



Specifying types for quantifiers allows to express more 

Let $\sigma_\Delta$ be some arbitrary dummy global environment. If $F$ is a formula without global variables we write $\llbracket F \rrbracket \sigma_\Delta$ to denote the truth value of $F$; clearly $\llbracket F \rrbracket \sigma_\Delta$ does not depend on $\sigma_\Delta$ and we denote it simply $\llbracket F \rrbracket$ when no ambiguity arises.
We use Hilbert's epsilon as a notational convenience in metatheory. If $P(x)$ is a unary predicate, then $\varepsilon x. P(x)$ denotes an arbitrary element $d$ such that $P(d)$ holds, if such an element exists, or an arbitrary object otherwise.

2.1 Term Algebra 

We introduce the notion of term algebra [22, Page 14].

Let Nat be the set of natural numbers. Let the signature $\Sigma$ be a finite set of function symbols and constants and let $ar : \Sigma \to \mathrm{Nat}$ be a function specifying arity $ar(f)$ for every function symbol or constant $f \in \Sigma$. Let $FT(\Sigma)$ denote the set of finite ground terms over signature $\Sigma$. We assume that $\Sigma$ contains at least one constant $c \in \Sigma$, $ar(c) = 0$, and at least one function symbol $f \in \Sigma$, $ar(f) > 0$. Therefore, $FT(\Sigma)$ is countably infinite.

Let $Cons(\Sigma)$ be the term algebra interpretation of signature $\Sigma$, defined as follows [22, Page 14]. For every $f \in \Sigma$ with $ar(f) = k$ define $\llbracket f \rrbracket \in Cons(\Sigma)$, with $\llbracket f \rrbracket : FT(\Sigma)^k \to FT(\Sigma)$ by

$$\llbracket f \rrbracket(t_1, \ldots, t_k) = f(t_1, \ldots, t_k)$$

We will write $f$ instead of $\llbracket f \rrbracket$ when it causes no confusion.

2.2 Terms as Trees 

We define trees representing terms as follows. 

We use sequences of nonnegative integers to denote paths in the tree. Let $\Sigma$ be a signature. A tree over $\Sigma$ is a partial function $t$ from the set $\mathrm{Nat}^*$ of paths to the set $\Sigma$ of function symbols such that:

1. if $w \in \mathrm{Nat}^*$, $x \in \mathrm{Nat}$, and $t(w \cdot x)$ is defined, then $t(w)$ is defined as well;

2. if $t(w) = f$ with $ar(f) = k$, then

$$\{ i \mid t(w \cdot i) \text{ is defined} \} = \{1, \ldots, k\}$$

A finite tree is a tree with a finite domain.
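As a minimal sketch of this definition, a finite tree can be stored as a dictionary from paths (tuples of positive integers) to symbols. The signature `ARITY` and the helper names below are hypothetical and chosen only for illustration.

```python
# Hypothetical signature for illustration: symbol name -> arity.
ARITY = {"f": 2, "a": 0, "b": 0}

def term_to_tree(term, path=()):
    """Map a nested-tuple term to a dict {path: symbol}; children are numbered from 1."""
    tree = {path: term[0]}
    for i, child in enumerate(term[1:], start=1):
        tree.update(term_to_tree(child, path + (i,)))
    return tree

def is_tree(t, arity=ARITY):
    """Check both conditions: the domain is prefix-closed, and the children
    of a node labelled f are exactly the positions 1..ar(f)."""
    for w, f in t.items():
        if w and w[:-1] not in t:  # condition 1: the parent path must be defined
            return False
        children = {v[-1] for v in t if len(v) == len(w) + 1 and v[:-1] == w}
        if children != set(range(1, arity[f] + 1)):  # condition 2
            return False
    return True
```

A dictionary with a finite number of keys corresponds to a finite tree in the sense above; removing an inner node or a required child makes `is_tree` fail.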

2.3 First Order Structures with Partial Functions 

We make use of partial functions in our quantifier elimination procedures. In this section we briefly describe the approach to partial functions we chose to use; other approaches would work as well, see e.g. [24].

A language of partial functions $\Sigma_1$ contains partial function symbols in addition to total function symbols and relation symbols. Consider a structure with the domain $A$ interpreting a language with partial function symbols $\Sigma_1$. Given some environment $\sigma$, we have $\llbracket t \rrbracket \sigma \in A \cup \{\bot\}$ where $\bot \notin A$ is a special value denoting undefined results. We require the interpretations of total and partial function symbols to be strict in $\bot$, i.e. $\llbracket f \rrbracket(a_1, \ldots, a_i, \bot, a_{i+2}, \ldots, a_k) = \bot$.

We interpret atomic formulas and their negations over the three-valued domain {false, true, undef} using strong Kleene's three-valued logic [26, 24, 44]. We require that $\llbracket R \rrbracket(a_1, \ldots, a_i, \bot, a_{i+2}, \ldots, a_k) = \mathrm{undef}$ for every relational symbol $R$. Logical connectives in Kleene's strong three-valued logic are the strongest "regular" extension of the corresponding connectives on the two-valued domain [26]. The regularity requirement means that the three-valued logic is a sound approximation of two-valued logic in the following sense. We may obtain the truth tables for three-valued logic by considering the truth values false, true, undef as shorthands for sets {false}, {true}, {false, true} and defining each logical operation $*$ by:

$$s_1 \llbracket * \rrbracket s_2 = \{ b_1 \circ b_2 \mid b_1 \in s_1 \land b_2 \in s_2 \}$$

where $\circ$ denotes the corresponding operation in the two-valued logic. As in a call-by-value semantics of lambda calculus, variables in the environments ($\sigma$) do not range over $\bot$. We interpret quantifiers as ranging over the domain $A$ or its subset if the logic is multisorted; the interpretations of quantifiers are similarly the best regular approximations of the corresponding two-valued interpretations.
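The set-based reading of the three truth values can be written down directly. The following sketch, with names chosen only for this example, lifts two-valued operations pointwise to sets of possible truth values and recovers the usual strong Kleene tables.

```python
# Truth values as sets of possible two-valued outcomes.
FALSE = frozenset({False})
TRUE = frozenset({True})
UNDEF = frozenset({False, True})

def lift(op):
    """Lift a two-valued binary operation to sets of possible truth values."""
    return lambda s1, s2: frozenset(op(b1, b2) for b1 in s1 for b2 in s2)

k_and = lift(lambda a, b: a and b)
k_or = lift(lambda a, b: a or b)

def k_not(s):
    """Lifted negation."""
    return frozenset(not b for b in s)
```

For instance, `k_and(FALSE, UNDEF)` is `FALSE` because `False and b` is `False` for either choice of `b`, whereas `k_and(TRUE, UNDEF)` stays `UNDEF`.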

These properties of Kleene's three-valued logic have the following important consequence. Suppose that we extend the definition of all partial functions to make them total functions on the domain $A$ by assigning arbitrary values outside the original domain. Suppose that a formula $\phi$ evaluates to an element $b \in \{\mathrm{false}, \mathrm{true}\}$ in Kleene's logic. Then $\phi$ evaluates to the same truth value $b$ in the new logic of total functions. This property of three-valued logic implies that the algorithms that we use to transform formulas with partial functions will apply even for the logic that makes all functions total by completing them with arbitrary elements of $A$.

We say that a formula $\phi$ is well-defined iff its truth value is an element of {false, true}.

Example 1 Consider the domain of real numbers. The following formulas are not well-defined:

$$3 = 1/0$$
$$\forall x.\ 1/x > 0 \lor 1/x < 0 \lor 1/x = 0$$

The following formulas are well-defined:

$$\exists x.\ 1/x = 3$$
$$\forall x.\ 1/x \neq 3$$
$$x = 0 \lor 1/x > 0$$



We say that a formula $\phi_1$ is equivalent to a formula $\phi_2$ and write $\phi_1 \cong \phi_2$ iff

$$\llbracket \phi_1 \rrbracket \sigma = \llbracket \phi_2 \rrbracket \sigma$$

for all valuations $\sigma$ (including those for which $\llbracket \phi_1 \rrbracket \sigma = \mathrm{undef}$).

Sections below perform equivalence-preserving transformations of formulas. This means that starting from a well-defined formula we obtain an equivalent well-defined formula.

When doing equivalence-preserving transformations it is useful to observe that $\land$, $\lor$ still form a distributive lattice. The partial order of this lattice is the chain false < undef < true. The element undef does not have a complement in the lattice; the unary operation $\lnot$ does not denote the lattice complement. However, the following laws still hold:

$$\lnot(x \land y) = \lnot x \lor \lnot y$$
$$\lnot(x \lor y) = \lnot x \land \lnot y$$
$$\lnot\lnot x = x$$
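These lattice facts can be checked exhaustively. The sketch below assumes an integer encoding of the chain false < undef < true (0, 1, 2), with conjunction as minimum and disjunction as maximum; the encoding and function names are illustrative choices, not notation from the paper.

```python
from itertools import product

# Encode the chain false < undef < true as 0 < 1 < 2.
FALSE, UNDEF, TRUE = 0, 1, 2
VALUES = (FALSE, UNDEF, TRUE)

def k_and(x, y):
    return min(x, y)  # meet of the chain

def k_or(x, y):
    return max(x, y)  # join of the chain

def k_not(x):
    return 2 - x      # swaps false and true, fixes undef

def laws_hold():
    """Check both De Morgan laws, double negation, and distributivity."""
    for x, y, z in product(VALUES, repeat=3):
        if k_not(k_and(x, y)) != k_or(k_not(x), k_not(y)):
            return False
        if k_not(k_or(x, y)) != k_and(k_not(x), k_not(y)):
            return False
        if k_and(x, k_or(y, z)) != k_or(k_and(x, y), k_and(x, z)):
            return False
    return all(k_not(k_not(x)) == x for x in VALUES)

def has_complement(x):
    """Is there c with x AND c = false and x OR c = true?"""
    return any(k_and(x, c) == FALSE and k_or(x, c) == TRUE for c in VALUES)
```

Under this encoding `has_complement(UNDEF)` is false, which matches the remark that negation is not a lattice complement.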

The properties of $\land$, $\lor$, $\lnot$ are sufficient to transform any quantifier-free formula into a disjunction of conjunctions of literals using the well-known straightforward technique. However, this straightforward technique in some cases yields conjunctions that are not well-defined, even though the formula as a whole is well-defined.

Example 2 Transforming a negation of a well-defined formula:

$$\lnot(x \neq 0 \land (y = 1/x \lor z = x + 1))$$

may yield the following disjunction of conjunctions:

$$x = 0 \lor (y \neq 1/x \land z \neq x + 1)$$

where $y \neq 1/x \land z \neq x + 1$ is not a well-defined conjunction for $x = 0$.



To enable the transformation of each well-defined formula into a disjunction of well-defined conjunctions of literals, we enrich the language of function and relation symbols as follows. With each partial function symbol $f \in \Sigma_1$ of arity $k = ar(f)$ we associate a domain description $D_f = \langle \langle x_1, \ldots, x_k \rangle, \phi \rangle$ specifying the domain of $f$. Here $x_1, \ldots, x_k$ are distinct variables and $\phi$ is an unnested conjunction of literals such that $FV(\phi) \subseteq \{x_1, \ldots, x_k\}$. We require every interpretation of a first-order structure with partial function symbols to satisfy the following property:

$$\llbracket f \rrbracket(a_1, \ldots, a_k) \neq \bot \iff \llbracket \phi \rrbracket [x_1 \mapsto a_1, \ldots, x_k \mapsto a_k] = \mathrm{true}$$

for all $a_1, \ldots, a_k \in A$. We henceforth assume that every structure with partial functions is equipped with a domain description $D_f$ for every partial function symbol $f$.

Proposition 8 below gives an algorithm for transforming a given well-defined formula into a disjunction of well-defined conjunctions. We first give some definitions and lemmas.

Definition 3 If $\psi$ is a formula with free variables, a domain formula for $\psi$ is a formula $\phi$ not containing partial function symbols such that, for every valuation $\sigma$,

$$\llbracket \psi \rrbracket \sigma \neq \mathrm{undef} \iff \llbracket \phi \rrbracket \sigma = \mathrm{true}$$

From Definition 3 we obtain the following Lemma 4.

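Lemma 4 Let $\psi$ be a formula and $\phi$ a domain formula for $\psi$. Then

$$\psi \cong (\psi \land \phi) \lor (\mathrm{undef} \land \lnot\phi)$$

Proof. Let $\sigma$ be an arbitrary valuation. Let $v = \llbracket \psi \rrbracket \sigma$. If $v \in \{\mathrm{true}, \mathrm{false}\}$ then $\llbracket \phi \rrbracket \sigma = \mathrm{true}$ and

$$\llbracket (\psi \land \phi) \lor (\mathrm{undef} \land \lnot\phi) \rrbracket \sigma = (v \land \mathrm{true}) \lor (\mathrm{undef} \land \mathrm{false}) = v.$$

If $v = \mathrm{undef}$ then $\llbracket \phi \rrbracket \sigma = \mathrm{false}$, so

$$\llbracket (\psi \land \phi) \lor (\mathrm{undef} \land \lnot\phi) \rrbracket \sigma = (\mathrm{undef} \land \mathrm{false}) \lor (\mathrm{undef} \land \mathrm{true}) = \mathrm{undef}.$$

■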



Observe that $\psi \land \phi$ in Lemma 4 is a well-defined conjunction. We use this property to construct domain formulas using partial function domain descriptions.

Let

$$D_f = \langle \langle x_1, \ldots, x_k \rangle, B^f_1 \land \ldots \land B^f_{l_f} \rangle$$

for each partial function symbol $f \in \Sigma_1$ of arity $k$, where $B^f_1, \ldots, B^f_{l_f}$ are unnested literals. If $t_1, \ldots, t_k$ are terms, we write $B^f_j(t_1, \ldots, t_k)$ for $B^f_j[x_1 := t_1, \ldots, x_k := t_k]$. Let $\mathrm{subt}(t)$ denote the set of all subterms of term $t$.

For any literal $B(t_1, \ldots, t_n)$ where $B(t_1, \ldots, t_n) \equiv R(t_1, \ldots, t_n)$ or $B(t_1, \ldots, t_n) \equiv \lnot R(t_1, \ldots, t_n)$, define

$$\mathrm{DomForm}(B(t_1, \ldots, t_n)) = \bigwedge_{\substack{f(s_1, \ldots, s_k) \in \bigcup_{1 \leq i \leq n} \mathrm{subt}(t_i) \\ 1 \leq j \leq l_f}} B^f_j(s_1, \ldots, s_k) \qquad (1)$$



Lemma 5 Let $B(t_1, \ldots, t_n)$ be a literal containing partial function symbols. Then $\mathrm{DomForm}(B(t_1, \ldots, t_n))$ is a domain formula for $B(t_1, \ldots, t_n)$.

Proof. Let $\sigma$ be a valuation. By strictness of interpretations of function and predicate symbols, $\llbracket B(t_1, \ldots, t_n) \rrbracket \sigma \neq \mathrm{undef}$ iff $\llbracket f(s_1, \ldots, s_k) \rrbracket \sigma \neq \bot$ for every subterm $f(s_1, \ldots, s_k)$ of every term $t_i$, iff $\llbracket B^f_j(s_1, \ldots, s_k) \rrbracket \sigma = \mathrm{true}$ for every $1 \leq j \leq l_f$ and every subterm $f(s_1, \ldots, s_k)$. ■



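Lemma 6 Let $B$ be a literal and let

$$\mathrm{DomForm}(B) = F_1 \land \ldots \land F_m.$$

Then

$$B \cong (B \land F_1 \land \ldots \land F_m) \lor \bigvee_{1 \leq i \leq m} (\mathrm{undef} \land \lnot F_i \land \mathrm{DomForm}(F_i))$$

Proof. If $\llbracket B \rrbracket \sigma \neq \mathrm{undef}$, then $\llbracket F_i \rrbracket \sigma = \mathrm{true}$ for every $1 \leq i \leq m$, and

$$\llbracket \mathrm{undef} \land \lnot F_i \land \mathrm{DomForm}(F_i) \rrbracket \sigma = \mathrm{false}$$

so the right-hand side evaluates to $\llbracket B \rrbracket \sigma$ as well. Now consider the case when $\llbracket B \rrbracket \sigma = \mathrm{undef}$. Then there exists a term $f(s_1, \ldots, s_k)$ such that $\llbracket f(s_1, \ldots, s_k) \rrbracket \sigma = \bot$. Because $\sigma(x) \neq \bot$ for every variable $x$, there exists a term $f(s_1, \ldots, s_k)$ such that $\llbracket f(s_1, \ldots, s_k) \rrbracket \sigma = \bot$ and $\llbracket s_i \rrbracket \sigma \neq \bot$ for $1 \leq i \leq k$. Then there exists a formula $F_p$ of form $B^f_j(s_1, \ldots, s_k)$ such that $\llbracket B^f_j(s_1, \ldots, s_k) \rrbracket \sigma = \mathrm{false}$, and

$$\llbracket \mathrm{undef} \land \lnot F_p \land \mathrm{DomForm}(F_p) \rrbracket \sigma = \mathrm{undef}.$$

Because

$$\llbracket B \land F_1 \land \ldots \land F_m \rrbracket \sigma = \mathrm{false},$$

and for every $q$,

$$\llbracket \mathrm{undef} \land \lnot F_q \land \mathrm{DomForm}(F_q) \rrbracket \sigma \in \{\mathrm{undef}, \mathrm{false}\},$$

the right-hand side evaluates to undef. ■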

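Lemma 7 Let $\phi_0(\bar{y})$ and $\phi_1(\bar{y})$ be well-defined formulas whose free variables are among $\bar{y}$ and let

$$\psi(\bar{y}) \equiv (\mathrm{undef} \land \phi_0(\bar{y})) \lor \phi_1(\bar{y})$$

If $\psi(\bar{y})$ is well-defined for all values of variables $\bar{y}$, then

$$\psi(\bar{y}) \cong \phi_1(\bar{y})$$

Proof. Consider any valuation $\sigma$. Let

$$v = \llbracket \phi_1(\bar{y}) \rrbracket \sigma$$

and

$$v' = \llbracket \psi(\bar{y}) \rrbracket \sigma$$

We need to show $v = v'$. Because $\phi_1(\bar{y})$ and $\psi(\bar{y})$ are well-defined, $v, v' \in \{\mathrm{false}, \mathrm{true}\}$. We consider two cases.

Case 1. $v = \mathrm{true}$. Then also $v' = \mathrm{true}$.

Case 2. $v = \mathrm{false}$. Then $v' = \mathrm{undef} \land \llbracket \phi_0(\bar{y}) \rrbracket \sigma$. Because $v' \neq \mathrm{undef}$, we conclude $v' = \mathrm{false}$. ■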



Proposition 8 Every well-defined quantifier-free formula $\psi$ can be transformed into an equivalent disjunction $\psi'$ of well-defined conjunctions of literals.

Proof. Using the standard procedure, convert $\psi$ to a disjunction of conjunctions

$$C_1 \lor \ldots \lor C_n$$

Let $C_i \equiv B \land C_i'$ where $B$ is a literal and let $\mathrm{DomForm}(B) = F_1 \land \ldots \land F_m$. Replace $B \land C_i'$ by

$$(B \land F_1 \land \ldots \land F_m \land C_i') \lor \bigvee_{1 \leq j \leq m} (\mathrm{undef} \land \lnot F_j \land \mathrm{DomForm}(F_j) \land C_i')$$

By Lemma 6 and distributivity, the result is an equivalent formula. Repeat this process for every literal in $C_1 \lor \ldots \lor C_n$. The result can be written in the form

$$(\mathrm{undef} \land \phi_1) \lor \ldots \lor (\mathrm{undef} \land \phi_p) \lor \phi_{p+1} \lor \ldots \lor \phi_{p+q} \qquad (2)$$

where each $\phi_j$ for $1 \leq j \leq p + q$ is a well-defined conjunction. Formula (2) is equivalent to

$$(\mathrm{undef} \land (\phi_1 \lor \ldots \lor \phi_p)) \lor \phi_{p+1} \lor \ldots \lor \phi_{p+q} \qquad (3)$$

and is equivalent to the well-defined formula $\psi$, so it is well-defined. Formulas $\phi_1 \lor \ldots \lor \phi_p$ and $\phi_{p+1} \lor \ldots \lor \phi_{p+q}$ are also well-defined. By Lemma 7 we conclude that formula (3) is equivalent to

$$\phi_{p+1} \lor \ldots \lor \phi_{p+q} \qquad (4)$$

Because (4) is a disjunction of well-defined formulas, (4) is the desired result $\psi'$. ■

The following proposition presents transformation to unnested form for the structures with equality and partial function symbols, building on Proposition 8. For a similar unnested form in the first-order logic containing only total function symbols, see [22, Page 58].

Proposition 9 Every well-defined quantifier-free formula $\psi$ in a language with equality can be effectively transformed into an equivalent formula $\psi'$ where $\psi'$ is a disjunction of existentially quantified well-defined conjunctions of the following kinds of literals:

• $R(x_1, \ldots, x_k)$ where $R$ is some relational symbol of arity $k$ and $x_1, \ldots, x_k$ are variables;

• $\lnot R(x_1, \ldots, x_k)$ where $R$ is some relational symbol of arity $k$ and $x_1, \ldots, x_k$ are variables;

• $x_1 = x_2$ where $x_1, x_2$ are variables;

• $x = f(x_1, \ldots, x_k)$ where $f$ is some partial or total function symbol of arity $k$ and $x, x_1, \ldots, x_k$ are variables;

• $x_1 \neq x_2$ where $x_1$ and $x_2$ are variables.

Proof. Transform the formula to a disjunction of well-defined conjunctions of literals as in the proof of Proposition 8.

Then repeatedly perform the following transformation on each well-defined conjunction $\phi$. Let $A(f(x_1, \ldots, x_k))$ be an atomic formula containing term $f(x_1, \ldots, x_k)$. Replace $\phi \land A(f(x_1, \ldots, x_k))$ with

$$\exists x_0.\ \phi \land x_0 = f(x_1, \ldots, x_k) \land A(x_0)$$

Replace $x \neq f(x_1, \ldots, x_k)$ with

$$x_0 = f(x_1, \ldots, x_k) \land x_0 \neq x$$

Repeat this process until the resulting conjunction $\phi'$ is in unnested form. $\phi'$ is clearly equivalent to the original conjunction $\phi$ when all partial functions are well-defined. When some partial function is not well-defined, then both $\phi$ and $\phi'$ evaluate to false, because by construction of $\phi$ in the proof of Proposition 8 each conjunction contains conjuncts that evaluate to false when some application of a function symbol is not well-defined. ■

Let a left-strict conjunction in Kleene logic be denoted by $\land'$ and defined by

$$p \land' q = (p \land q) \lor (p \land \lnot p)$$

The correctness of the transformation to unnested form in Proposition 9 relies on the presence of conjuncts that ensure that the entire conjunction evaluates to false whenever some term is undefined. The following Lemma 10 enables transformation to unnested form in an arbitrary context, allowing the transformation to unnested form to be performed independently from ensuring well-definedness of conjuncts.
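The behaviour of the left-strict conjunction can be tabulated by brute force. This is a minimal sketch under an assumed integer encoding of the truth values (0 for false, 1 for undef, 2 for true); the function names are illustrative only.

```python
# Assumed encoding of the chain false < undef < true.
FALSE, UNDEF, TRUE = 0, 1, 2
VALUES = (FALSE, UNDEF, TRUE)

def k_and(x, y):
    return min(x, y)

def k_or(x, y):
    return max(x, y)

def k_not(x):
    return 2 - x

def left_strict_and(p, q):
    """p AND' q, defined as (p AND q) OR (p AND NOT p)."""
    return k_or(k_and(p, q), k_and(p, k_not(p)))

def expected(p, q):
    """Intended reading: undefined whenever p is undefined, otherwise p AND q."""
    return UNDEF if p == UNDEF else k_and(p, q)
```

The difference from ordinary conjunction shows up only when the left argument is undefined and the right argument is false: `k_and(UNDEF, FALSE)` is false, while `left_strict_and(UNDEF, FALSE)` is undefined.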

Lemma 10 Let $\phi(x)$ be a formula with free variable $x$ and let $t$ be a term possibly containing partial function symbols. Then

1. $\phi(t) \cong (\exists x.\ x = t \land \phi(x)) \lor (\mathrm{undef} \land \forall x. \lnot\phi(x))$;

2. $\phi(t) \cong \exists x.\ x = t \land' \phi(x)$;

3. $\phi(t) \cong (\exists x.\ x = t \land \phi(x)) \lor (t \neq t)$.

Proof. Straightforward. ■

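Proposition 13 below shows that a simplification similar to the one in Lemma 7 can be applied even within the scope of quantifiers. To show Proposition 13 we first show two lemmas.

Lemma 11 For all formulas $\phi_0(x, \bar{y})$ and $\phi_1(x, \bar{y})$,

$$\exists x.\ (\mathrm{undef} \land \phi_0(x, \bar{y})) \lor \phi_1(x, \bar{y}) \cong (\mathrm{undef} \land \exists x. \phi_0(x, \bar{y})) \lor \exists x. \phi_1(x, \bar{y})$$

Proof. By distributivity of quantifiers and propositional connectives in Kleene logic we have:

$$\exists x.\ (\mathrm{undef} \land \phi_0(x, \bar{y})) \lor \phi_1(x, \bar{y}) \cong$$
$$(\exists x.\ \mathrm{undef} \land \phi_0(x, \bar{y})) \lor \exists x. \phi_1(x, \bar{y}) \cong$$
$$(\mathrm{undef} \land \exists x. \phi_0(x, \bar{y})) \lor \exists x. \phi_1(x, \bar{y})$$

■

Lemma 12 For all formulas $\phi_0(x, \bar{y})$ and $\phi_1(x, \bar{y})$,

$$\forall x.\ (\mathrm{undef} \land \phi_0(x, \bar{y})) \lor \phi_1(x, \bar{y}) \cong (\mathrm{undef} \land \forall x.\ \phi_0(x, \bar{y}) \lor \phi_1(x, \bar{y})) \lor \forall x. \phi_1(x, \bar{y})$$

Proof. The following sequence of equivalences holds.

$$\forall x.\ (\mathrm{undef} \land \phi_0(x, \bar{y})) \lor \phi_1(x, \bar{y}) \cong$$
$$\lnot \exists x.\ \lnot((\mathrm{undef} \land \phi_0(x, \bar{y})) \lor \phi_1(x, \bar{y})) \cong$$
$$\lnot \exists x.\ (\mathrm{undef} \lor \lnot\phi_0(x, \bar{y})) \land \lnot\phi_1(x, \bar{y}) \cong$$
$$\lnot \exists x.\ (\mathrm{undef} \land \lnot\phi_1(x, \bar{y})) \lor (\lnot\phi_0(x, \bar{y}) \land \lnot\phi_1(x, \bar{y})) \cong$$
$$\lnot((\mathrm{undef} \land \exists x. \lnot\phi_1(x, \bar{y})) \lor (\exists x.\ \lnot\phi_0(x, \bar{y}) \land \lnot\phi_1(x, \bar{y}))) \cong$$
$$(\mathrm{undef} \lor \forall x. \phi_1(x, \bar{y})) \land (\forall x.\ \phi_0(x, \bar{y}) \lor \phi_1(x, \bar{y})) \cong$$
$$(\mathrm{undef} \land \forall x.\ \phi_0(x, \bar{y}) \lor \phi_1(x, \bar{y})) \lor \forall x. \phi_1(x, \bar{y})$$

■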

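Proposition 13 Let $\phi_0(\bar{x}, \bar{y})$ and $\phi_1(\bar{x}, \bar{y})$ be well-defined formulas whose free variables are among $\bar{x}, \bar{y}$ and let

$$\psi(\bar{y}) \equiv Q_1 x_1 \ldots Q_n x_n.\ (\mathrm{undef} \land \phi_0(\bar{x}, \bar{y})) \lor \phi_1(\bar{x}, \bar{y})$$

where $Q_1, \ldots, Q_n$ are quantifiers. If $\psi(\bar{y})$ is well-defined for all values of variables $\bar{y}$, then

$$\psi(\bar{y}) \cong Q_1 x_1 \ldots Q_n x_n.\ \phi_1(\bar{x}, \bar{y})$$

Proof. Applying successively Lemmas 11 and 12 to quantifiers $Q_n, \ldots, Q_1$, we conclude

$$\psi(\bar{y}) \cong (\mathrm{undef} \land \phi_2(\bar{y})) \lor Q_1 x_1 \ldots Q_n x_n.\ \phi_1(\bar{x}, \bar{y})$$

for some formula $\phi_2(\bar{y})$. Then by Lemma 7,

$$\psi(\bar{y}) \cong Q_1 x_1 \ldots Q_n x_n.\ \phi_1(\bar{x}, \bar{y}).$$

■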



3 Some Quantifier Elimination Procedures 

As a preparation for the proof of the decidability of term algebras of decidable theories, we present quantifier elimination procedures for some theories that are known to admit quantifier elimination. We use the results and ideas from this section to show the new results in Sections 4, 5, and 6.

3.1 Quantifier Elimination 

Our technique for showing decidability of structural subtyping of recursive types is based on quantifier elimination. This section gives some general remarks on quantifier elimination.

We follow [22] in describing quantifier elimination procedures. According to [22, Page 70, Lemma 2.7.4] it suffices to eliminate $\exists y$ from formulas of the form

$$\exists y.\ \bigwedge_{1 \leq i \leq n} \psi_i(\bar{x}, y) \qquad (5)$$

where $\bar{x}$ is a tuple of variables and $\psi_i(\bar{x}, y)$ is a literal all of whose variables are among $\bar{x}, y$. The reason why eliminating formulas of the form (5) suffices is the following. Suppose that the formula is in prenex form and consider the innermost quantifier of the formula. Let $\phi$ be the subformula containing the quantifier and the subformula that is the scope of the quantifier. If $\phi$ is of the form $\forall x. \phi_0$ we may replace $\phi$ with $\lnot \exists x. \lnot\phi_0$. Hence, we may assume that $\phi$ is of the form $\exists x. \phi_1$. We then transform $\phi_1$ into disjunctive normal form and use the fact

$$\exists x.\ (\phi_2 \lor \phi_3) \iff (\exists x. \phi_2) \lor (\exists x. \phi_3) \qquad (6)$$

We conclude that elimination of quantifiers from formulas of form (5) suffices to eliminate the innermost quantifier. By repeatedly eliminating innermost quantifiers we can eliminate all quantifiers from a formula.

We may also assume that $y$ occurs in every literal $\psi_i$, otherwise we would place the literal outside the existential quantifier using the fact

$$\exists y.\ (A \land B) \iff (\exists y. A) \land B$$

for $y$ not occurring in $B$.

To eliminate variables we often use the following identity of a theory with equality:

$$\exists x.\ x = t \land \phi(x) \iff \phi(t) \qquad (7)$$

Section 2.3 presents analogous identities for partial functions.

The quantifier elimination procedures we give imply the decidability of the underlying theories. In this paper the interpretations of function and relation symbols on some domain $A$ are effectively computable functions and relations on $A$. Therefore, the truth value of every formula without variables is computable. The quantifier elimination procedures we present are all effective. To determine the truth value of a closed formula $\phi$ it therefore suffices to apply the quantifier elimination procedure to $\phi$, yielding a quantifier free formula $\psi$, and then evaluate the truth value of $\psi$.

3.2 Quantifier Elimination for Boolean Algebras 

This section presents a quantifier elimination procedure for finite boolean algebras. This result dates back at least to [46], see also [27], [49], [22, Section 2.7 Exercise 3]. Note that the operations union, intersection and complement are definable in the first-order language of the subset relation. Therefore, quantifier elimination for the first-order theory of the boolean algebra of sets is no harder than the quantifier elimination for the first-order theory of the subset relation. However, the operations of boolean algebra are useful in the process of quantifier elimination, so we give the quantifier elimination procedure for the language containing boolean algebra operations.

Instead of the first-order theory of the subset relation we could consider monadic second-order theory with no relation or function symbols. These two languages are equivalent because the first-order quantifiers can be eliminated from monadic second-order theory using the subset relation (see Section 7.1).

Finite boolean algebras are isomorphic to boolean algebras whose elements are all subsets of some finite set. We therefore use the symbols for the set operations as the language of boolean algebras. $t_1 \cap t_2$, $t_1 \cup t_2$, $t^c$, $0$, $1$ correspond to set intersection, set union, set complement, empty set, and full set, respectively. We write $t_1 \subseteq t_2$ for $t_1 \cap t_2 = t_1$, and we write $t_1 \subset t_2$ for the conjunction $t_1 \subseteq t_2 \land t_1 \neq t_2$.

For every nonnegative integer $k$ we introduce formulas $|t| \geq k$ expressing that the set denoted by $t$ has at least $k$ elements, and formulas $|t| = k$ expressing that the set denoted by $t$ has exactly $k$ elements. These properties are first-order definable as follows.

$$|t| \geq 0 \ \equiv\ \mathrm{true}$$
$$|t| \geq k+1 \ \equiv\ \exists x.\ x \subset t \land |x| \geq k$$
$$|t| = k \ \equiv\ |t| \geq k \land \lnot\, |t| \geq k+1$$

We call a language which contains terms $|t| \geq k$ and $|t| = k$ the language of boolean algebras with finite cardinality constraints. Because finite cardinality constraints are first-order definable, the language with finite cardinality constraints is equally expressive as the language of boolean algebras.
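The recursive definition of the cardinality constraints can be tested on a small powerset algebra. The sketch below is an illustration only: it represents elements as Python `frozenset` values, and because the bound variable is constrained to be a proper subset of $t$, the existential quantifier is evaluated by enumerating subsets of $t$.

```python
from itertools import combinations

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def card_ge(t, k):
    """|t| >= k, using only the proper-subset relation as in the recursive definition."""
    if k == 0:
        return True
    # |t| >= k  iff  there is x properly contained in t with |x| >= k - 1.
    return any(x < t and card_ge(x, k - 1) for x in subsets(t))

def card_eq(t, k):
    """|t| = k, defined as |t| >= k and not |t| >= k + 1."""
    return card_ge(t, k) and not card_ge(t, k + 1)
```

On subsets of a three-element set, `card_ge` and `card_eq` agree with the built-in `len`, which is what first-order definability of the constraints predicts.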

Every inequality $t_1 \subseteq t_2$ is equivalent to the equality $t_1 \cap t_2 = t_1$, and every equality $t_3 = t_4$ is equivalent to the cardinality constraint

$$|(t_3 \cap t_4^c) \cup (t_4 \cap t_3^c)| = 0$$

It is therefore sufficient to consider the first-order formulas whose only atomic formulas are of the form $|t| = 0$. For the purpose of quantifier elimination we will additionally consider formulas that contain atomic formulas $|t| = k$ for all $k \geq 1$, as well as $|t| \geq k$ for $k \geq 0$.

Note that we can eliminate negative literals as follows:

$$\lnot\, |t| = k \iff |t| = 0 \lor \cdots \lor |t| = k-1 \lor |t| \geq k+1$$
$$\lnot\, |t| \geq k \iff |t| = 0 \lor \cdots \lor |t| = k-1 \qquad (8)$$

Every formula in the language of boolean algebras can therefore be written in prenex normal form where the matrix of the formula is a disjunction of conjunctions of atomic formulas of the form $|t| = k$ and $|t| \geq k$, with no negative literals.
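The elimination of negative literals in (8) is easy to mechanize. In the sketch below a cardinality literal is modelled as a pair `(kind, k)` with `kind` either `"eq"` or `"ge"`, and is evaluated directly on a candidate cardinality `n`; this representation is an assumption made for the example.

```python
def negate(literal):
    """Return a list of positive literals whose disjunction is equivalent
    to the negation of `literal`, following (8)."""
    kind, k = literal
    below = [("eq", j) for j in range(k)]  # |t| = 0, ..., |t| = k-1
    if kind == "eq":
        return below + [("ge", k + 1)]
    if kind == "ge":
        return below
    raise ValueError(f"unknown literal kind: {kind}")

def holds(literal, n):
    """Evaluate a cardinality literal when the set has exactly n elements."""
    kind, k = literal
    return n == k if kind == "eq" else n >= k
```

Note that `negate(("ge", 0))` is the empty disjunction, which is false, as expected since $|t| \geq 0$ always holds.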

Note that if a term $t$ contains at least one operation of arity one or more, we may assume that the constants $0$ and $1$ do not appear in $t$, because $0$ and $1$ can be simplified away. Furthermore, the expression $|0|$ denotes the integer zero, so all terms of form $|0| = k$ or $|0| \geq k$ evaluate to true or false. We can therefore simplify every nontrivial term $t$ so that either $t$ contains no occurrences of constants $0$ and $1$, or $t \equiv 1$.

We next describe a quantifier elimination procedure for finite boolean algebras.

We first transform the formula into prenex normal form and then repeatedly eliminate the innermost quantifier. As argued in Section 3.1, it suffices to show that we can eliminate an existential quantifier from any existentially quantified conjunction of literals. Consider therefore an arbitrary existentially quantified conjunction of literals

$$\exists y.\ \bigwedge_{1 \leq i \leq n} \psi_i$$

where $\psi_i$ is of the form $|t| = k$ or of the form $|t| \geq k$. We assume that $y$ occurs in every formula $\psi_i$. It follows that no $\psi_i$ contains $|0|$ or $|1|$.

Let $x_1, \ldots, x_m, y$ be the set of variables occurring in formulas $\psi_i$ for $1 \leq i \leq n$.

First consider the more general case $m \geq 1$. Let for $i_1, \ldots, i_m \in \{0, 1\}$,

$$t_{i_1 \ldots i_m} = x_1^{i_1} \cap \cdots \cap x_m^{i_m}$$

where $t^0 = t$ and $t^1 = t^c$. The terms in the set

$$P = \{ t_{i_1 \ldots i_m} \mid i_1, \ldots, i_m \in \{0, 1\} \}$$





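original formula                                             eliminated form
$\exists y.\ |s \cap y| \ge k \land |s \cap y^c| \ge l$      $|s| \ge k + l$
$\exists y.\ |s \cap y| = k \land |s \cap y^c| \ge l$        $|s| \ge k + l$
$\exists y.\ |s \cap y| \ge k \land |s \cap y^c| = l$        $|s| \ge k + l$
$\exists y.\ |s \cap y| = k \land |s \cap y^c| = l$          $|s| = k + l$

Figure 1: Rules for Eliminating Quantifiers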



form a partition; moreover every boolean algebra expression
whose variables are among $x_i$ can be written as a disjoint
union of some elements of the partition $P$. Any boolean
algebra expression containing $y$ can be written, for some
$p, q \ge 0$, as

$$(s_1 \cap y) \cup \cdots \cup (s_p \cap y) \cup (t_1 \cap y^c) \cup \cdots \cup (t_q \cap y^c)$$

where $s_1, \ldots, s_p \in P$ are pairwise distinct elements from the
partition and $t_1, \ldots, t_q \in P$ are pairwise distinct elements
from the partition. Because

$$|(s_1 \cap y) \cup \cdots \cup (s_p \cap y) \cup (t_1 \cap y^c) \cup \cdots \cup (t_q \cap y^c)| =
|s_1 \cap y| + \cdots + |s_p \cap y| + |t_1 \cap y^c| + \cdots + |t_q \cap y^c|$$

the constraint of form $|t| = k$ can be written as

$$\bigvee_{k_1, \ldots, k_p, l_1, \ldots, l_q}
|s_1 \cap y| = k_1 \land \cdots \land |s_p \cap y| = k_p \land
|t_1 \cap y^c| = l_1 \land \cdots \land |t_q \cap y^c| = l_q$$

where the disjunction ranges over nonnegative integers
$k_1, \ldots, k_p, l_1, \ldots, l_q \ge 0$ that satisfy

$$k_1 + \cdots + k_p + l_1 + \cdots + l_q = k$$

From (8) it follows that we can perform a similar transformation
for constraints of form $|t| \ge k$. After performing this
transformation, we bring the formula into disjunctive normal
form and continue eliminating the existential quantifier
separately for each disjunct, as argued in Section 3.1. We
may therefore assume that all conjuncts $\psi_i$ are of one of the
forms: $|s \cap y| = k$, $|s \cap y^c| = k$, $|s \cap y| \ge k$, and $|s \cap y^c| \ge k$
where $s \in P$.
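
The disjunction above ranges over all ways of distributing $k$ among the $p + q$ cells. As an illustrative sketch (the function names `splits` and `expand_eq` and the dictionary encoding of a conjunction are our own assumptions), the disjuncts can be enumerated as follows.

```python
def splits(k, parts):
    """Yield all tuples of `parts` nonnegative integers that sum to k."""
    if parts == 0:
        if k == 0:
            yield ()
        return
    for first in range(k + 1):
        for rest in splits(k - first, parts - 1):
            yield (first,) + rest


def expand_eq(cells, k):
    """Expand |t| = k, where t is the disjoint union of the given cells,
    into a list of conjunctions. Each conjunction is a dict mapping a
    cell name to the exact cardinality required of that cell."""
    return [dict(zip(cells, counts)) for counts in splits(k, len(cells))]
```

For $p + q$ cells the number of disjuncts is $\binom{k + p + q - 1}{p + q - 1}$, so this step alone can increase the formula size considerably.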

If there are two conjuncts both of which contain $|s \cap y|$ for
the same $s$, then either they are contradictory or one implies
the other. We therefore assume that for any $s \in P$, there is
at most one conjunct $\psi_i$ containing $|s \cap y|$. For analogous
reasons we assume that for every $s \in P$ there is at most one
conjunct $\psi_i$ containing $|s \cap y^c|$. The result of eliminating
the variable $y$ is then given in Figure 1. The case when a
literal containing $|s \cap y|$ does not occur is covered by the
case $|s \cap y| \ge k$ for $k = 0$; similarly for a literal containing
$|s \cap y^c|$.
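
The rules of Figure 1 can be validated by brute force on small sets. The sketch below is only a sanity check under our own encoding (literals as pairs `(kind, k)`, and the hypothetical names `exists_y` and `eliminated`); it enumerates all candidate values of $s \cap y$ for a concrete finite set $s$.

```python
from itertools import combinations


def holds(literal, size):
    """Evaluate a literal ("eq", k) or ("ge", k) on a cardinality."""
    kind, k = literal
    return size == k if kind == "eq" else size >= k


def exists_y(s, lit_y, lit_yc):
    """Brute force: is there a y such that lit_y holds of |s & y| and
    lit_yc holds of |s & y^c|?  Only the part of y inside s matters."""
    s = list(s)
    for r in range(len(s) + 1):
        for part in combinations(s, r):        # part plays the role of s & y
            if holds(lit_y, len(part)) and holds(lit_yc, len(s) - len(part)):
                return True
    return False


def eliminated(lit_y, lit_yc):
    """Right-hand column of Figure 1, as a literal about |s|."""
    (kind1, k), (kind2, l) = lit_y, lit_yc
    kind = "eq" if kind1 == kind2 == "eq" else "ge"
    return (kind, k + l)
```

Only the rule with two equalities produces an equality; in the other three cases the surplus elements of $s$ can always be placed on the side constrained by $\ge$.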

It remains to consider the case $m = 0$. Then $y$ is the
only variable occurring in conjuncts $\psi_i$. Every cardinality
expression $t$ containing only $y$ reduces to one of $|y|$ or $|y^c|$.
If there are multiple literals containing $|y|$, they are either
contradictory or one implies the others. We may therefore
assume there is at most one literal containing $|y|$ and at
most one literal containing $|y^c|$. We eliminate the quantifier by
applying the rules in Figure 1, putting formally $s = 1$ where 1 is
the universal set.



This completes the description of quantifier elimination
from an existentially quantified conjunction. By repeating
this process for all quantifiers we arrive at a quantifier-free
formula $\psi$. Hence we have the following theorem.

Theorem 14 For every first-order formula $\varphi$ in the language
of boolean algebras with finite cardinality constraints
there exists a quantifier-free formula $\psi$ such that $\psi$ is a
disjunction of conjunctions of literals of form $|t| \ge k$ and $|t| = k$
where $t$ are terms of boolean algebra, the free variables of $\psi$
are a subset of the free variables of $\varphi$, and $\psi$ is equivalent to
$\varphi$ on all algebras of finite sets.

Remark 15 Now consider the case when formula $\varphi$ has no
free variables. By Theorem 14, $\varphi$ is equivalent to $\psi$ where $\psi$
contains only terms without variables. A term without variables
in boolean algebra can always be simplified to 0 or 1.
Because $|0| = 0$, the literals with $|0|$ reduce to true or false,
so we may simplify them away. The expression $|1|$ evaluates
to the number of elements in the boolean algebra. We call
literals $|1| = k$ and $|1| \ge k$ domain cardinality constraints. A
quantifier-free formula $\psi$ can therefore be written as a
propositional combination of domain cardinality constraints. We
can simplify $\psi$ into a disjunction of conjunctions of domain
cardinality constraints and transform each conjunction so
that it contains at most one literal. The result $\psi'$ is a single
disjunction of domain cardinality constraints. We may
further assume that the disjunct of form $|1| \ge k$ occurs at
most once. Therefore, the truth value of each closed boolean
algebra formula is characterized by a set $C$ of possible
cardinalities of the domain. If $\psi'$ does not contain any $|1| \ge k$
literals, the set $C$ is finite. Otherwise, $C = C_0 \cup \{k, k+1, \ldots\}$
for some $k$ where $C_0$ is a finite subset of $\{1, \ldots, k-1\}$.
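
The set $C$ of Remark 15 has a simple finite representation: a finite set $C_0$ together with an optional threshold $k$. The following sketch is our own illustration (the class `CardinalitySet` and the function `from_disjunction` are hypothetical names, and a disjunction is assumed to be given as a list of pairs `("eq", k)` or `("ge", k)`).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CardinalitySet:
    finite: FrozenSet[int]        # the finite part C_0
    threshold: Optional[int]      # k, or None if no |1| >= k literal occurs

    def __contains__(self, n):
        if self.threshold is not None and n >= self.threshold:
            return True
        return n in self.finite


def from_disjunction(literals):
    """Normalize a disjunction of domain cardinality constraints."""
    thresholds = [k for kind, k in literals if kind == "ge"]
    # Several |1| >= k disjuncts collapse to the one with the smallest k.
    threshold = min(thresholds) if thresholds else None
    finite = frozenset(
        k for kind, k in literals
        if kind == "eq" and (threshold is None or k < threshold)
    )
    return CardinalitySet(finite, threshold)
```

Deciding a closed formula on a boolean algebra with $n$ elements then amounts to the membership test `n in C`.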

3.3 Feferman-Vaught Theorem 

The Feferman-Vaught technique is a way of
discovering the first-order theories of com- 
plex structures by analyzing their components. 
This description is a little vague, and in 
fact the Feferman-Vaught technique itself has 
something of a floating identity. It works 
for direct products, as we shall see. Clever 
people can make it work in other situations too. 
I page 458 



We next review the Feferman-Vaught theorem for direct
products [13], which implies that the products of structures
with decidable first-order theories have decidable first-order
theories.

The result was first obtained for strong and weak powers
of theories in [33]; [35] also suggests the generalization
to products. Our sketch here mostly follows [14] and [35];
see also [31, Chapter 12] as well as [22, Section 9.6].
Somewhat specific to our presentation is the fact that we use a
multisorted logic and build into the language the correspondence
between formulas interpreted over $C$ and the cylindric
algebra of sets of positions.

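Let $L_C$ be a relational language. Let further $I$ be some
nonempty finite or countably infinite index set. For each
$i \in I$ let $\mathcal{C}_i = \langle C_i, [\![\_]\!]^{\mathcal{C}_i} \rangle$ be a decidable structure interpreting
the language $L_C$.

We define direct product of the family of structures $\mathcal{C}_i$,
$i \in I$, as the structure

$$\mathcal{P} = \prod_{i \in I} \mathcal{C}_i$$

where $\mathcal{P} = \langle P, [\![\_]\!]^{\mathcal{P}} \rangle$. $P$ is the set of all functions $t$ such that
$t(i) \in C_i$ for $i \in I$, and $[\![\_]\!]^{\mathcal{P}}$ is defined by

$$[\![r]\!]^{\mathcal{P}}(t_1, \ldots, t_k) = \forall i.\ [\![r]\!]^{\mathcal{C}_i}(t_1(i), \ldots, t_k(i))$$

for each relation symbol $r \in L_C$.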



inner formula relations for $r \in L_C$
    $r$ :: tuple$^k$ $\to$ indset
inner logical connectives
    $\land'$, $\lor'$ :: indset $\times$ indset $\to$ indset
    $\lnot'$ :: indset $\to$ indset
    true$'$, false$'$ :: indset
inner formula quantifiers
    $\exists'$, $\forall'$ :: (tuple $\to$ indset) $\to$ indset
index set equality
    $='$ :: indset $\times$ indset $\to$ bool
logical connectives
    $\land$, $\lor$ :: bool $\times$ bool $\to$ bool
    $\lnot$ :: bool $\to$ bool
    true, false :: bool
index set quantifiers
    $\exists^I$, $\forall^I$ :: (indset $\to$ bool) $\to$ bool
tuple quantifiers
    $\exists$, $\forall$ :: (tuple $\to$ bool) $\to$ bool

Figure 2: Operations in product structure

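For the purpose of quantifier elimination we consider a
richer language of statements about product structure $\mathcal{P}$.
Figure 2 shows this richer language. The corresponding
structure $\mathcal{P}_2 = \langle P_2, [\![\_]\!]^{\mathcal{P}_2} \rangle$ contains, in addition to the
function space $P$, a copy of the boolean algebra $2^I$ of subsets of
the index set $I$. We interpret a relation $r \in L_C$ by

$$[\![r]\!]^{\mathcal{P}_2}(t_1, \ldots, t_k) = \{\, i \mid [\![r]\!]^{\mathcal{C}_i}(t_1(i), \ldots, t_k(i)) \,\}$$

We let $[\![\mathrm{true}']\!]^{\mathcal{P}_2} = I$ and write

$$r(t_1, \ldots, t_k) =' \mathrm{true}'$$

to express $[\![r]\!]^{\mathcal{P}}(t_1, \ldots, t_k)$. Hence $\mathcal{P}_2$ is at least as expressive
as $\mathcal{P}$.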



Note that Figure 2 does not contain an equality relation
between tuples. If we need to express the equality between
tuples, we assume that some binary relation $r_0 \in L_C$ in the
base structure is interpreted as equality, and express the
equality between tuples $t_1$ and $t_2$ using the formula:

$$r_0(t_1, t_2) =' \mathrm{true}'.$$

Figure 3 shows the semantics of the language in Figure 2.
(The logic has no partial functions, so we interpret the sort
bool over the set {true, false}.)



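inner formula relations for $r \in L_C$
    $[\![r]\!]^{\mathcal{P}_2}(t_1, \ldots, t_k) = \{\, i \mid [\![r]\!]^{\mathcal{C}_i}(t_1(i), \ldots, t_k(i)) \,\}$
inner logical connectives
    $[\![\land']\!]^{\mathcal{P}_2}(A_1, A_2) = A_1 \cap A_2$
    $[\![\lor']\!]^{\mathcal{P}_2}(A_1, A_2) = A_1 \cup A_2$
    $[\![\lnot']\!]^{\mathcal{P}_2}(A) = I \setminus A$
    $[\![\mathrm{true}']\!]^{\mathcal{P}_2} = I$
    $[\![\mathrm{false}']\!]^{\mathcal{P}_2} = \emptyset$
inner formula quantifiers
    $[\![\exists']\!]^{\mathcal{P}_2} f = \bigcup_{t \in P} f(t)$
    $[\![\forall']\!]^{\mathcal{P}_2} f = \bigcap_{t \in P} f(t)$
index set equality
    $[\![=']\!]^{\mathcal{P}_2}(A_1, A_2) = (A_1 = A_2)$
logical connectives
    (interpreted as usual)
index set quantifiers
    $[\![\exists^I]\!]^{\mathcal{P}_2} f = \exists A \in 2^I.\ f(A)$
    $[\![\forall^I]\!]^{\mathcal{P}_2} f = \forall A \in 2^I.\ f(A)$
tuple quantifiers
    $[\![\exists]\!]^{\mathcal{P}_2} f = \exists t \in P.\ f(t)$
    $[\![\forall]\!]^{\mathcal{P}_2} f = \forall t \in P.\ f(t)$

Figure 3: Semantics of operations in product structure $\mathcal{P}_2$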
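
To make the semantics in Figure 3 concrete, the following sketch interprets the inner operations over a small finite power. It is an illustration under our own assumptions: the index set `I` has three elements, every component uses the same carrier `C`, tuples are Python tuples indexed by `I`, and all function names are hypothetical.

```python
from itertools import product
from operator import lt

I = range(3)                            # index set (finite, for illustration)
C = [0, 1, 2]                           # carrier of the base structure
P = list(product(C, repeat=len(I)))     # all functions I -> C, as tuples


def rel(r, *ts):
    """Inner relation: the set of indices at which r holds componentwise."""
    return frozenset(i for i in I if r(*(t[i] for t in ts)))


def and_(a, b):
    return a & b


def or_(a, b):
    return a | b


def not_(a):
    return frozenset(I) - a


def exists_inner(f):
    """Inner existential quantifier: union of f(t) over all tuples t."""
    return frozenset().union(*(f(t) for t in P))


def forall_inner(f):
    """Inner universal quantifier: intersection of f(t) over all tuples t."""
    return frozenset(I).intersection(*(f(t) for t in P))


def between(t1, t2):
    """The inner formula  exists' t. r(t1, t) and' r(t, t2)  with r = <."""
    return exists_inner(lambda t: and_(rel(lt, t1, t), rel(lt, t, t2)))
```

Because `P` contains every function from `I` to `C`, an inner formula evaluates to exactly the set of indices at which the corresponding base formula holds.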



We let $A_1 \subseteq' A_2$ stand for $A_1 \land' A_2 =' A_1$.

Note that the interpretations of $\land'$, $\lor'$, $\lnot'$, true$'$, false$'$,
$='$, $\exists^I$, $\forall^I$ form a first-order structure of boolean algebras of
subsets of the set $I$. We call formulas in this boolean algebra
sublanguage index-set algebra formulas.

On the other hand, relations $r$ for $r \in L_C$, together with
$\land'$, $\lor'$, $\lnot'$, $\exists'$, $\forall'$ form the signature of first-order logic with
relation symbols. We call formulas built only from these
operations inner formulas.

Let $\varphi$ be an inner formula with free tuple variables
$t_1, \ldots, t_m$ and no free indset variables. Then $\varphi$ specifies a
relation $\rho \subseteq P^m$. Consider the corresponding first-order
formula $\varphi'$ interpreted in the base structure $C$; formula $\varphi'$
specifies a relation $\rho' \subseteq C^m$. The following property follows
from the semantics in Figure 3:

$$\rho(t_1, \ldots, t_m) = \{\, i \in I \mid \rho'(t_1(i), \ldots, t_m(i)) \,\} \qquad (9)$$



Sort constraints imply that quantifiers $\exists'$, $\forall'$ are only applied
to inner formulas. Let $\varphi$ be a formula of sort bool. By
labelling subformulas of sort indset with variables $A_1, \ldots, A_n$,
we can write $\varphi$ in form $\varphi^1$:

$$\exists^I A_1, \ldots, A_n.\ A_1 =' \varphi_1 \land \cdots \land A_n =' \varphi_n \land \psi(A_1, \ldots, A_n)$$

where

$$\varphi = \psi(\varphi_1, \ldots, \varphi_n)$$



Furthermore, by defining $B_1, \ldots, B_m$ to be the partition of
true$'$ consisting of terms of form

$$A_1^{p_1} \land' \cdots \land' A_n^{p_n}$$

for $p_1, \ldots, p_n \in \{0, 1\}$, we can find a formula $\psi'$ and formulas
$\varphi'_1, \ldots, \varphi'_m$ such that $\varphi^1$ is equivalent to $\varphi^2$:

$$\exists^I B_1, \ldots, B_m.\ B_1 =' \varphi'_1 \land \cdots \land B_m =' \varphi'_m \land \psi'(B_1, \ldots, B_m) \qquad (10)$$

and where $\varphi'_1, \ldots, \varphi'_m$ evaluate to sets that form a partition of
true$'$ for all values of free variables. (By partition of true$'$ we
here mean a family of pairwise disjoint sets whose union is
true$'$, but we do not require the sets to be non-empty.)
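
The partition $B_1, \ldots, B_m$ consists of the $m = 2^n$ regions of the Venn diagram of $A_1, \ldots, A_n$. A minimal sketch of this construction on concrete finite sets follows; the function name `venn_partition` and the convention that bit 1 selects $A_i$ and bit 0 selects its complement are our own assumptions.

```python
from itertools import product


def venn_partition(universe, sets):
    """Return a dict mapping each bit vector (p_1, ..., p_n) to the cell
    obtained by intersecting A_i (bit 1) or its complement (bit 0).
    Cells may be empty, as allowed for a partition of true'."""
    universe = frozenset(universe)
    cells = {}
    for bits in product((1, 0), repeat=len(sets)):
        cell = universe
        for bit, a in zip(bits, sets):
            cell &= a if bit else universe - a
        cells[bits] = cell
    return cells
```

Every $A_i$ is recovered as the union of the cells whose $i$-th bit is 1, which is why $\psi$ can be rewritten as a formula $\psi'$ over the cells.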

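Now consider a formula of form $\exists t. \varphi$ where $\varphi$ is without
$\exists$, $\forall$ quantifiers (but possibly contains $\exists'$, $\forall'$ and $\exists^I$, $\forall^I$
quantifiers). We transform $\varphi$ into $\varphi^2$ as described, and then
replace

$$\exists t.\ \exists^I B_1, \ldots, B_m.\ B_1 =' \varphi'_1 \land \cdots \land B_m =' \varphi'_m \land \psi'(B_1, \ldots, B_m) \qquad (11)$$

with

$$
\begin{array}{l}
\exists^I D_1, \ldots, D_m.\ \exists^I B_1, \ldots, B_m. \\
\quad D_1 =' (\exists' t.\ \varphi'_1) \land \cdots \land D_m =' (\exists' t.\ \varphi'_m) \ \land \\
\quad B_1 \subseteq' D_1 \land \cdots \land B_m \subseteq' D_m \ \land \\
\quad \mathrm{partition}(B_1, \ldots, B_m) \land \psi'(B_1, \ldots, B_m)
\end{array}
\qquad (12)
$$

where $\mathrm{partition}(B_1, \ldots, B_m)$ denotes a boolean algebra
expression expressing that sets $B_1, \ldots, B_m$ form a partition
of true$'$.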

It is easy to see that (11) and (12) are equivalent.

By repeating this construction we eliminate all term
quantifiers from a formula. We then eliminate all set
quantifiers as in Section 3.2. For that purpose we extend the
language with cardinality constraints.

As the result we obtain cardinality constraints on inner
formulas. Closed inner formulas evaluate to true$'$ or false$'$
depending on their truth value in base structure $C$. Hence,
if $C$ is decidable, so is $\mathcal{P}_2$.

Theorem 16 (Feferman-Vaught) Let $C$ be a decidable
structure. Then every formula in the language of Figure 2 is
equivalent on the structure $\mathcal{P}_2$ to a propositional combination
of cardinality constraints of the index-set boolean algebra, i.e.
formulas of form $|\varphi| \ge k$ and $|\varphi| = k$ where $\varphi$ is an inner
formula.

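Example 17 Let $r \in L_C$ be a binary relation on structure
$C$. Let us eliminate quantifier $\exists t$ from the formula $\varphi(t_1, t_2)$:

$$
\begin{array}{l}
\exists t.\ \exists^I A_1, A_2, A_3. \\
\quad A_1 =' r(t, t_1) \land A_2 =' r(t_1, t) \land A_3 =' r(t_2, t) \ \land \\
\quad |\lnot' A_1| = 0 \land |\lnot' A_2| = 0 \land |\lnot' A_3| \ge 1
\end{array}
$$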

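We first introduce sets $B_0, \ldots, B_7$ that form a partition of
true$'$. The formula is then equivalent to $\varphi_1$:

$$
\begin{array}{l}
\exists t.\ \exists^I B_0, B_1, B_2, B_3, B_4, B_5, B_6, B_7. \\
\quad B_0 =' r(t, t_1) \land' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad B_1 =' \lnot' r(t, t_1) \land' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad B_2 =' r(t, t_1) \land' \lnot' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad B_3 =' \lnot' r(t, t_1) \land' \lnot' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad B_4 =' r(t, t_1) \land' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad B_5 =' \lnot' r(t, t_1) \land' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad B_6 =' r(t, t_1) \land' \lnot' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad B_7 =' \lnot' r(t, t_1) \land' \lnot' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad \varphi_0
\end{array}
$$

where

$$
\begin{array}{rl}
\varphi_0 = & |B_1| = 0 \land |B_2| = 0 \ \land \\
& |B_3| = 0 \land |B_5| = 0 \ \land \\
& |B_6| = 0 \land |B_7| = 0 \ \land \\
& |B_4| \ge 1
\end{array}
$$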
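We now eliminate the quantifier $\exists t$ from the formula $\varphi_1$,
obtaining formula $\varphi_2$:

$$
\begin{array}{l}
\exists^I D_0, D_1, D_2, D_3, D_4, D_5, D_6, D_7. \\
\quad D_0 =' \exists' t.\ r(t, t_1) \land' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad D_1 =' \exists' t.\ \lnot' r(t, t_1) \land' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad D_2 =' \exists' t.\ r(t, t_1) \land' \lnot' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad D_3 =' \exists' t.\ \lnot' r(t, t_1) \land' \lnot' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad D_4 =' \exists' t.\ r(t, t_1) \land' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad D_5 =' \exists' t.\ \lnot' r(t, t_1) \land' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad D_6 =' \exists' t.\ r(t, t_1) \land' \lnot' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad D_7 =' \exists' t.\ \lnot' r(t, t_1) \land' \lnot' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad \varphi_3
\end{array}
$$

where

$$
\begin{array}{rl}
\varphi_3 = & \exists^I B_0, B_1, B_2, B_3, B_4, B_5, B_6, B_7. \\
& B_0 \subseteq' D_0 \land \cdots \land B_7 \subseteq' D_7 \ \land \\
& \mathrm{partition}(B_0, \ldots, B_7) \land \varphi_0
\end{array}
$$

Here the conjunct $\mathrm{partition}(B_0, \ldots, B_7)$ is the one required by (12).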

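We next apply quantifier elimination for boolean algebras
to formula $\varphi_3$ and obtain formula $\varphi'_3$:

$$\varphi'_3 = |D_4| \ge 1 \land |\lnot' D_0 \land' \lnot' D_4| = 0$$

Hence $\varphi(t_1, t_2)$ is equivalent to

$$
\begin{array}{l}
\exists^I D_0, D_4. \\
\quad D_0 =' \exists' t.\ r(t, t_1) \land' r(t_1, t) \land' r(t_2, t) \ \land \\
\quad D_4 =' \exists' t.\ r(t, t_1) \land' r(t_1, t) \land' \lnot' r(t_2, t) \ \land \\
\quad |D_4| \ge 1 \land |\lnot' D_0 \land' \lnot' D_4| = 0
\end{array}
$$

After substituting the definitions of $D_0$ and $D_4$, formula
$\varphi(t_1, t_2)$ can be written without quantifiers $\exists$, $\forall$, $\exists^I$, $\forall^I$.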
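
The boolean algebra step of Example 17 can be confirmed by exhaustive search over small index sets. The sketch below is only a sanity check with hypothetical names; it uses the fact that the cardinality constraints force all sets other than $B_0$ and $B_4$ to be empty, so that the partition condition makes $B_0$ the complement of $B_4$.

```python
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]


def phi3(I, D0, D4):
    """Exists B0, B4 with B0 <= D0, B4 <= D4, {B0, B4} a partition of I
    and |B4| >= 1 (the remaining B's are forced to be empty)."""
    for B4 in subsets(D4):
        B0 = I - B4            # partition forces B0 to be the complement of B4
        if B4 and B0 <= D0:
            return True
    return False


def phi3_eliminated(I, D0, D4):
    """|D4| >= 1  and  |not' D0 and' not' D4| = 0."""
    return len(D4) >= 1 and len(I - D0 - D4) == 0
```

The two functions agree on every choice of $D_0$ and $D_4$ for the index sets tried in the test, which is consistent with the formula obtained above.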



3.4 Term Algebras 

In this section we present a quantifier elimination procedure
for term algebras (see Section 2.1). A quantifier elimination
procedure for term algebras implies that the first-order theory
of term algebras is decidable. In the sections below we
build on the procedure in this section to define quantifier
elimination procedures for structural subtyping.

The decidability of the first-order theory of term algebras
follows from Mal'cev's work on locally free algebras
[31, Chapter 23]. [39] also gives an argument for decidability
of term algebra and presents a unification algorithm
based on congruence closure [38]. Infinite trees are studied
in [12]. [30] presents a complete axiomatization for algebra
of finite, infinite and rational trees. A proof in the style of
[22] for an extension of free algebra with queues is presented
in [43]. Decidability of an extension of term algebras with
membership tests is presented in [10] in the form of a
terminating term rewriting system. Unification and disunification
problems are special cases of the decision problem for the first-order
theory of term algebras; for a survey see e.g. [45, 9].

We believe that our proof provides some insight into
different variations of quantifier elimination procedures for
term algebras. Like [22] we use selector language symbols,
but retain the usual constructor symbols as well. The
advantage of the selector language is that $\exists y.\ z = f(x, y)$ is
equivalent to a quantifier-free formula $x = f_1(z) \land \mathsf{Is}_f(z)$.
On the other hand, constructor symbols also increase the
set of relations on terms definable via quantifier-free formulas,
which can slightly simplify the quantifier-elimination
procedure, as will be seen by comparing Proposition 34 and
Proposition 38. Compared to [22, Page 70], we find that the
termination of our procedure is more evident and the
extension to the term-power algebra in Section 6 easier. Our
base formulas somewhat resemble formulas arising in other
quantifier elimination procedures [31, 11, 15, 7]. Our terminology
also borrows from congruence closure graphs like those
of [39, 38], although we are not primarily concerned with
efficiency of the algorithm described. Term algebra is an
example of a theory of pairing functions, and [15] shows that
a non-empty family of theories of pairing functions has a
non-elementary lower bound on time complexity.

3.4.1 Term Algebra in Selector Language 

To facilitate quantifier elimination we use a selector language
$\mathrm{Sel}(\Sigma)$ for term algebra [22, Page 61]. We define term
algebra in selector language as a first-order structure with
partial functions.

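The set $\mathrm{Sel}(\Sigma)$ contains, for every function symbol $f \in \Sigma$
of arity $\mathrm{ar}(f) = k$, a unary predicate $\mathsf{Is}_f \subseteq \mathrm{FT}(\Sigma)$ and
functions $f_1, \ldots, f_k : \mathrm{FT}(\Sigma) \to \mathrm{FT}(\Sigma)$ such that

$$
\begin{array}{rcll}
\mathsf{Is}_f(t) & \iff & \exists t_1, \ldots, t_k.\ t = f(t_1, \ldots, t_k) & (13) \\
f_i(f(t_1, \ldots, t_k)) & = & t_i, \quad 1 \le i \le k & (14) \\
f_i(t) & = & \bot, \quad \lnot \mathsf{Is}_f(t) & (15)
\end{array}
$$

For every $f \in \Sigma$ and $1 \le i \le \mathrm{ar}(f)$, expression $f_i(t)$ is defined
iff $\mathsf{Is}_f(t)$ holds, so we let $D_{f_i} = \langle x, \mathsf{Is}_f(x) \rangle$.

As a special case, if $d$ is a constant, then $\mathrm{ar}(d) = 0$ and
$\mathsf{Is}_d(t) \iff t = d$.

Proposition 18 For every formula $\varphi_1$ in the language
$\mathrm{Cons}(\Sigma)$ there exists an equivalent formula $\varphi_2$ in the selector
language.

Proof Sketch. Because of the presence of the equality symbol,
every formula in language $\mathrm{Cons}(\Sigma)$ can be written in
unnested form such that every atomic formula is of one of two
forms: $x_1 = x_2$, or $f(x_1, \ldots, x_k) = y$, where $y$ and $x_i$ are
variables. We keep every formula $x_1 = x_2$ unchanged and
transform each formula

$$f(x_1, \ldots, x_k) = y$$

into the well-defined conjunction

$$x_1 = f_1(y) \land \cdots \land x_k = f_k(y) \land \mathsf{Is}_f(y)$$

Note that predicates $\mathsf{Is}_f$ form a partition of the set of all
terms, i.e. the following formulas are valid:

$$
\begin{array}{ll}
\forall x.\ \bigvee_{f \in \Sigma} \mathsf{Is}_f(x) & \\
\forall x.\ \lnot(\mathsf{Is}_f(x) \land \mathsf{Is}_g(x)), \quad \mbox{for } f \not\equiv g & (16)
\end{array}
$$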
Figure 4: Quantifier Elimination for Term Algebra



A constructor-selector language contains both constructor
symbols $f \in \mathrm{Cons}(\Sigma)$ and selector symbols $f_i \in \mathrm{Sel}(\Sigma)$.
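
As an illustration of the selector language, the following sketch models finite terms, the tests $\mathsf{Is}_f$ and the partial selectors $f_i$ in Python. The representation is our own assumption: a term is a named tuple of a symbol and its arguments, each symbol is assumed to have a fixed arity, and `None` stands for an undefined selector application.

```python
from typing import NamedTuple


class Term(NamedTuple):
    symbol: str
    args: tuple = ()


def is_f(f, t):
    """The test Is_f(t): the top-level symbol of t is f."""
    return t.symbol == f


def select(f, i, t):
    """The selector f_i(t) for 1 <= i <= ar(f); None models 'undefined'."""
    if not is_f(f, t):
        return None
    return t.args[i - 1]


def terms_up_to(depth):
    """All terms of height at most `depth` over constants a, b and binary f."""
    terms = [Term("a"), Term("b")]
    for _ in range(depth):
        terms = [Term("a"), Term("b")] + [
            Term("f", (s, t)) for s in terms for t in terms
        ]
    return terms
```

On this small universe one can check, for instance, that $\exists y.\ z = f(x, y)$ holds exactly when $\mathsf{Is}_f(z)$ holds and $f_1(z) = x$, which is the equivalence that motivates selectors.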

3.4.2 Quantifier Elimination 

We proceed to quantifier elimination for term algebra. A
schematic view of our proof is in Figure 4. The basic insight
is that any quantifier-free formula can be written in a
particular unnested form, as a disjunction of base formulas.
Base formulas trivially permit elimination of an existential
quantifier, yet every base formula can be converted back to
a quantifier-free formula.

A semi-base formula is almost a base formula, except
that it may be cyclic. We introduce cyclicity after explaining
the graph representation of a semi-base formula.

Definition 19 (Semi-Base Formula) A semi-base formula
$\beta$ with

• free variables $x_1, \ldots, x_m$,

• internal non-parameter variables $u_1, \ldots, u_p$, and

• internal parameter variables $u_{p+1}, \ldots, u_{p+q}$

is a formula of form

$$
\begin{array}{l}
\exists u_1, \ldots, u_n. \\
\quad \mathrm{distinct}(u_1, \ldots, u_n) \ \land \\
\quad \mathrm{structure}(u_1, \ldots, u_n) \ \land \\
\quad \mathrm{labels}(u_1, \ldots, u_n; x_1, \ldots, x_m)
\end{array}
$$

where $n = p + q$. Here $\mathrm{distinct}(u_1, \ldots, u_n)$ enforces that
variables are distinct:

$$\mathrm{distinct}(u_1, \ldots, u_n) = \bigwedge_{1 \le i < j \le n} u_i \ne u_j .$$

$\mathrm{structure}(u_1, \ldots, u_n)$ specifies relationships between terms
denoted by variables:

$$\mathrm{structure}(u_1, \ldots, u_n) = \bigwedge_{i=1}^{p} u_i = t_i(u_1, \ldots, u_n)$$

where each $t_i(u_1, \ldots, u_n)$ is a term of form $f(u_{i_1}, \ldots, u_{i_k})$
for $f \in \Sigma$, $k = \mathrm{ar}(f)$.

$\mathrm{labels}(u_1, \ldots, u_n; x_1, \ldots, x_m)$ identifies some free variables
with some parameter and non-parameter variables:

$$\mathrm{labels}(u_1, \ldots, u_n; x_1, \ldots, x_m) = \bigwedge_{i=1}^{m} x_i = u_{j_i}$$

for some function $j : \{1, \ldots, m\} \to \{1, \ldots, n\}$.

We require each semi-base formula to satisfy the following
congruence closure property: there are no two distinct
variables $u_i$ and $u_{i'}$ such that both $u_i = f(u_{l_1}, \ldots, u_{l_k})$
and $u_{i'} = f(u_{l_1}, \ldots, u_{l_k})$ occur as conjuncts in formula
structure.

We denote by $U$ the set of internal variables of a given
semi-base formula, $U = \{u_1, \ldots, u_n\}$.

Definition 20 A semi-base formula in selector language is
obtained from the semi-base formula in constructor language by
replacing every conjunct of form

$$u_i = f(u_{i_1}, \ldots, u_{i_k})$$

with the well-defined conjunction

$$\mathsf{Is}_f(u_i) \land u_{i_1} = f_1(u_i) \land \cdots \land u_{i_k} = f_k(u_i)$$

A semi-base formula in selector language is clearly a
well-defined conjunction of literals. All atomic formulas in a
semi-base formula are unnested, in both constructor and selector
language.

We can represent a semi-base formula as a labelled directed
graph with the set of nodes $U$; we call this graph the graph
associated with a semi-base formula. Nodes of the graph are
in a bijection with internal variables of the semi-base formula.
We call nodes corresponding to parameter variables
$u_{p+1}, \ldots, u_{p+q}$ parameter nodes; nodes $u_1, \ldots, u_p$ are
non-parameter nodes. Each non-parameter node is labelled by
a function symbol $f \in \Sigma$ and has exactly $\mathrm{ar}(f)$ successors,
with an edge from $u_k$ to $u_l$ labelled by the positive integer $i$
iff $f_i(u_k) = u_l$ occurs in the semi-base formula written in
selector language. A constant node is a node labelled by
some constant symbol $c \in \Sigma$, $\mathrm{ar}(c) = 0$. A constant node
is a sink in the graph; every sink is either a constant or a
parameter node. In addition to the labelling by function
symbols, each node $u \in U$ of the graph is labelled by zero
or more free variables $x$ such that equation $x = u$ occurs in
the semi-base formula.

Definition 21 (Base Formula) A semi-base formula $\varphi$ is
a base formula iff the graph associated with $\varphi$ is acyclic.

A semi-base formula whose associated graph is cyclic is
unsatisfiable in the term algebra of finite terms. Checking the
cyclicity of a semi-base formula corresponds to the occur-check in
unification algorithms (see e.g. [29, 11]).

Definition 22 By height $\mathcal{H}(u)$ of a node $u$ in the acyclic
graph we mean the length of the longest path starting from
$u$. A node $u$ is a sink iff $\mathcal{H}(u) = 0$.



Definition 23 We say that an internal variable $u_i$ is a
source variable of a base formula $\beta$ iff $u_i$ is represented by
a node that is a source in the directed acyclic graph
corresponding to $\beta$. Equivalently, if $\beta$ is written in the selector
language, then $u_i$ is a source variable iff $\beta$ contains no
equations of form $u_i = f_j(u_k)$.
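
The graph notions above (acyclicity, height, source nodes) are straightforward to compute. The following is a minimal sketch, assuming the graph is given as a dictionary `successors` from each node to the list of its successors; the function names are hypothetical and the depth-first search is a standard technique rather than a procedure taken from this paper.

```python
def heights(successors):
    """Return {node: height} if the graph is acyclic, or None if it has a
    cycle. The height of a node is the length of the longest path from it."""
    height, on_stack = {}, set()

    def visit(u):
        if u in height:
            return True
        if u in on_stack:                 # back edge: the graph is cyclic
            return False
        on_stack.add(u)
        best = 0
        for v in successors.get(u, ()):
            if not visit(v):
                return False
            best = max(best, height[v] + 1)
        on_stack.discard(u)
        height[u] = best
        return True

    for u in list(successors):
        if not visit(u):
            return None
    return height


def source_nodes(successors):
    """Nodes with no incoming edge."""
    targets = {v for succ in successors.values() for v in succ}
    return {u for u in successors if u not in targets}
```

A semi-base formula is a base formula exactly when `heights` returns a dictionary for its associated graph; the sinks are the nodes of height 0.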

Definition 24 If $u_i$ and $u_j$ are internal variables, we write
$u_i \leadsto^{*} u_j$ if there is a path in the underlying graph from node
$u_i$ to node $u_j$. Equivalently, $u_i \leadsto^{*} u_j$ iff there exists a term
$t(u_i)$ in the selector language such that $\models \beta \Rightarrow u_j = t(u_i)$.
Relation $\leadsto^{*}$ is a partial order on internal variables of $\beta$.

The following Lemma 25 is similar to the Independence
of Disequations Lemma in e.g. [10, Page 178].

Lemma 25 Let $\beta$ be a base formula of the form

$$\exists u_1, \ldots, u_n.\ \beta_0$$

where $u_{p+1}, \ldots, u_{p+q}$ are parameter variables of $\beta$, and $\beta_0$ is
quantifier-free. Let $S_{p+1}, \ldots, S_{p+q}$ be infinite sets of terms.
Then there exists a valuation $\sigma$ such that $[\![\beta_0]\!]\sigma = \mathrm{true}$ and
$[\![u_i]\!]\sigma \in S_i$ for $p+1 \le i \le p+q$.

Proof. To construct $\sigma$, assign first the values to parameter
variables, as follows. Let $h_G$ be the length of the longest
path in the graph associated with $\beta$. Pick $\sigma(u_{p+1}) \in S_{p+1}$ so
that $h(\sigma(u_{p+1})) > h_G$, and for each $i$ where $p+2 \le i \le p+q$
pick $\sigma(u_i) \in S_i$ so that $h(\sigma(u_i)) > h(\sigma(u_{i-1})) + h_G$. The
set of heights of an infinite set of terms is infinite, so it is
always possible to choose such $\sigma(u_i)$.

Next consider the non-parameter nodes $u_1, \ldots, u_p$ in some
topological order in which the successors of each node are
visited before the node itself. For each non-parameter node $u_i$ such
that $u_i = f(u_{i_1}, \ldots, u_{i_k})$ occurs in $\beta_0$, let $\sigma(u_i) =
f(\sigma(u_{i_1}), \ldots, \sigma(u_{i_k}))$.

Finally assign the values to free variables by $\sigma(x) = \sigma(u)$
where $x = u$ occurs in $\beta_0$.

By construction, $[\![\mathrm{structure}]\!]\sigma = \mathrm{true}$ and $[\![\mathrm{labels}]\!]\sigma =
\mathrm{true}$. It remains to show $[\![\mathrm{distinct}]\!]\sigma = \mathrm{true}$, i.e. $\sigma(u_i) \ne \sigma(u_j)$
for $1 \le i, j \le p+q$, $i \ne j$. We show this property of $\sigma$
by induction on $m = \min(\mathcal{H}(u_i), \mathcal{H}(u_j))$. Without loss of
generality we assume $\mathcal{H}(u_i) \le \mathcal{H}(u_j)$.

Consider first the case $m = 0$. Then $u_i$ is a parameter
or a constant node.

If $u_i$ is a constant and $u_j$ is a non-parameter variable
then $u_i$ and $u_j$ are labelled by different function symbols so
$\sigma(u_i) \ne \sigma(u_j)$.

If $u_i$ is a constant and $u_j$ is a parameter variable then
$h(\sigma(u_i)) = 0$ whereas $h(\sigma(u_j)) > h_G \ge 0$.

Consider the case where $u_i$ is a parameter variable and
$u_j$ is a non-parameter variable. Let

$$J = \{\, j_1 \mid u_{j_1} \mbox{ is a parameter variable s.t. } u_j \leadsto^{*} u_{j_1} \,\}$$

If $J = \emptyset$, then $\beta_0$ uniquely specifies $\sigma(u_j)$, and

$$h(\sigma(u_j)) = \mathcal{H}(u_j) \le h_G < h(\sigma(u_i))$$

Let $J \ne \emptyset$ and $j_0 = \max J$. If $i \le j_0$, then

$$h(\sigma(u_i)) \le h(\sigma(u_{j_0})) < h(\sigma(u_j))$$

If $j_0 < i$ then

$$h(\sigma(u_j)) \le h(\sigma(u_{j_0})) + h_G < h(\sigma(u_{j_0+1})) \le h(\sigma(u_i))$$

Now consider the case $m > 0$. $u_i$ and $u_j$ are
non-parameter nodes, so let $u_i = f(u_{i_1}, \ldots, u_{i_k})$ and $u_j =
g(u_{j_1}, \ldots, u_{j_l})$. If $f \not\equiv g$ then clearly $\sigma(u_i) \ne \sigma(u_j)$.
Otherwise, by the congruence closure property of base formulas, there
exists $d$ such that $u_{i_d} \not\equiv u_{j_d}$. Then by induction hypothesis
$\sigma(u_{i_d}) \ne \sigma(u_{j_d})$, so $\sigma(u_i) \ne \sigma(u_j)$. ■

Corollary 26 Every base formula is satisfiable. 
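
The construction in the proof of Lemma 25 can be sketched as follows. This is an illustration under our own assumptions: terms are nested tuples `(symbol, arg, ...)`, the base formula is given by a dictionary `structure` mapping each non-parameter node to its label and successor list, `params` lists the parameter nodes in index order, and `pick(u, h)` is a hypothetical oracle returning an element of $S_u$ of height greater than `h`.

```python
def term_height(t):
    """Height of a term represented as (symbol, arg_1, ..., arg_k)."""
    return 0 if len(t) == 1 else 1 + max(term_height(c) for c in t[1:])


def node_height(u, structure):
    """Length of the longest path from node u in the associated graph."""
    succ = structure[u][1] if u in structure else ()
    return 0 if not succ else 1 + max(node_height(v, structure) for v in succ)


def valuation(structure, params, pick):
    """Build sigma as in the proof: parameters receive terms of rapidly
    growing height, then non-parameter nodes are evaluated bottom-up."""
    h_g = max((node_height(u, structure) for u in structure), default=0)
    sigma, bound = {}, h_g
    for u in params:
        sigma[u] = pick(u, bound)
        assert term_height(sigma[u]) > bound
        bound = term_height(sigma[u]) + h_g

    def value(u):
        if u not in sigma:
            f, succ = structure[u]
            sigma[u] = (f,) + tuple(value(v) for v in succ)
        return sigma[u]

    for u in structure:
        value(u)
    return sigma


def unary_tower(u, h):
    """Example oracle: every S_u is {a, s(a), s(s(a)), ...}."""
    t = ("a",)
    for _ in range(h + 1):
        t = ("s", t)
    return t
```

The gap of $h_G$ between successive parameter heights is what prevents a term built over one parameter from coinciding with another parameter.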

Proposition 27 (Quantification of Base Formula) If
$\beta$ is a base formula and $x$ a free variable in $\beta$, then there
exists a base formula $\beta_1$ equivalent to $\exists x. \beta$.

Proof. Consider a formula $\exists x. \beta$ where $\beta$ is a base formula.
The only place where $x$ occurs in $\beta$ is a conjunct $x = u_j$ in the
subformula labels. By dropping the conjunct $x = u_j$ from $\beta$ we
obtain a base formula $\beta_1$ where $\beta_1$ is equivalent to $\exists x. \beta$. ■

Proposition 28 (Quantifier-Free to Base) Every well- 
defined quantifier-free formula in constructor-selector lan- 
guage can be written as true, false, or a disjunction of base 
formulas. 

Proof Sketch. Let $\varphi$ be a well-defined quantifier-free
formula in constructor-selector language. By Proposition [S]
we can transform $\varphi$ into an equivalent formula in disjunctive
normal form

$$\psi_1 \lor \cdots \lor \psi_p$$

where each $\psi_i$ is a well-defined conjunction of literals.
Consider an arbitrary $\psi_i$. There exists an unnested quantifier-free
formula $\psi'_i$ with additional fresh free variables $x_1, \ldots, x_q$
such that $\psi_i$ is equivalent to

$$\exists x_1, \ldots, x_q.\ \psi'_i$$

By distributivity and (6) it suffices to transform each
conjunction of unnested formulas into a disjunction of base
formulas. In the sequel we will assume transformations based
on distributivity and (6) are applied whenever we transform
a conjunction of literals into a formula containing disjunction.
We also assume that every equation $f(x_1, \ldots, x_n) = y$ is
replaced by the equivalent one $y = f(x_1, \ldots, x_n)$ and every
equation $f_i(x) = y$ is replaced by $y = f_i(x)$.

Because of our assumption that $\Sigma$ is finite, we can eliminate
every literal of form $\lnot \mathsf{Is}_f(x)$ using the equivalence

$$\lnot \mathsf{Is}_f(x) \iff \bigvee_{g \in \Sigma \setminus \{f\}} \mathsf{Is}_g(x) \qquad (17)$$

which follows from (16). We then transform the formula back
into disjunctive normal form and propagate the existential
quantifiers to the conjunctions of literals. We may therefore
assume that there are no literals of form $\lnot \mathsf{Is}_f(x)$ in the
conjunction. Furthermore, $\mathsf{Is}_f(x) \land \mathsf{Is}_g(x) \iff \mathrm{false}$ for $f \not\equiv g$,
so we may assume that for every variable $x$ there is at most one
literal $\mathsf{Is}_f(x)$ for some $f$. If $f_i(x)$ occurs in the conjunction,
because the conjunction is well-defined, we may always add
the conjunct $\mathsf{Is}_f(x)$. This way we ensure that exactly one
literal of form $\mathsf{Is}_f(x)$ occurs in the conjunction.

We next ensure that every variable has either none or
all of its components named by variables. If the conjunction
contains literal $\mathsf{Is}_f(x)$ but does not contain $x = f(x_1, \ldots, x_n)$
and does not contain an equation of form $y = f_i(x)$ for
every $i$, $1 \le i \le \mathrm{ar}(f)$, we introduce a fresh existentially
quantified variable for each $i$ such that a term of form $y =
f_i(x)$ does not appear in the conjunction. At this point
we may transform the entire conjunction into constructor
language by replacing

$$\mathsf{Is}_f(u_i) \land v_{i_1} = f_1(u_i) \land \cdots \land v_{i_k} = f_k(u_i)$$

with $u_i = f(v_{i_1}, \ldots, v_{i_k})$ for $k = \mathrm{ar}(f)$.

We next ensure that for every two variables $x_1$ and $x_2$
occurring in the conjunction exactly one of the conjuncts
$x_1 = x_2$ or $x_1 \ne x_2$ is present. Namely, if both conjuncts
$x_1 = x_2$ and $x_1 \ne x_2$ are present, the conjunction is false.
If none of the conjuncts is present, we insert the disjunction
$x_1 = x_2 \lor x_1 \ne x_2$ as one of the conjuncts and transform
the result into a disjunction of existentially quantified
conjunctions.

We next perform congruence closure for finite terms [38]
on the resulting conjunction, using the fact that equality is
reflexive, symmetric, transitive and congruent with respect
to free operations $f \in \mathrm{Cons}(\Sigma)$ and that $t(x) \ne x$ for every
term $t \not\equiv x$. Syntactically, the result of congruence closure
can be viewed as adding new equations to the conjunction.
If the congruence closure procedure establishes that the formula
is unsatisfiable, the result is false. Otherwise, all variables
are grouped into equivalence classes. If a conjunct $u_1 = u_2$
occurs in the conjunction where both $u_1$ and $u_2$ are internal
variables, we replace $u_1$ with $u_2$ in the formula and eliminate
the existential quantifier. If for some free variable $x$
there is no internal variable $u$ such that the conjunct $x = u$
occurs, we introduce a new existentially quantified variable $u$
and a conjunct $x = u$. These transformations ensure that
for every equivalence class there exists exactly one internal
variable in the formula. It is now easy to pick representative
conjuncts from the conjunction to obtain a conjunction of the
syntactic form in Definition 19 of semi-base formula. The
resulting formula is a base formula because the congruence closure
algorithm ensures that the associated graph is acyclic.
■

We next turn to the problem of transforming a base for- 
mula into a quantifier-free formula. We will present two 
constructions. The first construction yields a quantifier-free 
formula in constructor-selector language and is sufficient for 
the purpose of quantifier elimination. The second construc- 
tion yields a quantifier-free formula in selector language and 
is slightly more involved; we present it to provide additional 
insight into the quantifier elimination approach to term al- 
gebras. 

We first introduce notions of covered and determined
variables of a base formula $\beta$. The basic idea behind these
notions is that $\beta$ implies a functional dependence from the
free variables of $\beta$ to each of the determined variables.

In both constructions we use the notion of a covered
variable, which denotes a component of a term denoted by
some free variable. In the first construction we also use the
notion of determined variable, which includes covered variables
as well as variables constructed from covered variables
using constructor operations $f \in \mathrm{Cons}(\Sigma)$.

Definition 29 Consider an arbitrary base formula $\beta$. We
say that an internal variable $u$ is covered by a free variable
$x$ iff $x = u'$ occurs in $\beta$ for some $u'$ such that $u' \leadsto^{*} u$. An
internal variable $u$ is covered iff $u$ is covered by $x$ for some
free variable $x$ (in particular, if $x = u$ occurs in $\beta$ then $u$
is covered). Let covered denote the set of covered internal
variables of the base formula, and let $\mathrm{uncovered} = U \setminus \mathrm{covered}$
where $U$ is the set of all internal variables of $\beta$.

Lemma 30 (Covered Base to Selector) Every base
formula without uncovered variables is equivalent to a
quantifier-free formula in selector language.

Proof. Consider a base formula $\beta$ where every variable is
covered. Consider an arbitrary quantified variable $u$. Because
$u$ is covered, there exists a variable $x$ free in $\beta$ such that
$u = t(x)$ for some term $t$ in the selector language. Replace
every occurrence of $u$ in the matrix of $\beta$ by $t(x)$ and eliminate
the quantification over $u$. Repeating this process for
every variable $u$ we obtain a quantifier-free formula equivalent
to $\beta$. ■

Definition 31 Let $\beta$ be a base formula. The set determined
of determined variables of $\beta$ is the smallest set $S$ that
contains the set covered and satisfies the following condition:
if $u$ is a non-parameter node and all successors $u_1, \ldots, u_k$
($k \ge 0$) of $u$ in the associated graph are in $S$, then $u$ is also
in $S$.

In particular, every constant node is determined. A parameter
node $w$ is determined iff $w$ is covered.
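
The sets covered and determined can be computed by a reachability pass followed by a least fixpoint iteration. The sketch below uses our own encoding: `successors` maps every non-parameter node to its successor list (constants map to the empty list), `params` is the set of parameter nodes, and `labels` maps each free variable to the internal node it names; the function names are hypothetical.

```python
def covered(successors, labels):
    """Nodes reachable from a node named by some free variable."""
    seen, stack = set(), list(labels.values())
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(successors.get(u, ()))
    return seen


def determined(successors, params, labels):
    """Smallest set containing `covered` and closed under: a non-parameter
    node all of whose successors are in the set is itself in the set."""
    det = covered(successors, labels)
    changed = True
    while changed:
        changed = False
        for u, succ in successors.items():
            if u not in det and u not in params and all(v in det for v in succ):
                det.add(u)
                changed = True
    return det
```

A constant node has no successors, so the closure condition holds vacuously and the node is added in the first pass.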

Lemma 32 If a node $u$ is not determined, then there exists
an uncovered parameter node $v$ such that $u \leadsto^{*} v$.

Proof. The proof is by induction on $\mathcal{H}(u)$. If $\mathcal{H}(u) = 0$
then $u$ has no successors, and $u$ cannot be a constant node
because it is not determined. Therefore, it is a parameter
node, so we may let $v = u$. Assume that the statement
holds for every node $u'$ such that $\mathcal{H}(u') \le k$ and let
$\mathcal{H}(u) = k + 1$. Because $u$ is not determined, there exists
a successor $u'$ of $u$ such that $u'$ is not determined, so by
induction hypothesis there exists an uncovered parameter
node $v$ such that $u' \leadsto^{*} v$. Hence $u \leadsto^{*} u' \leadsto^{*} v$. ■

Lemma 33 Every base formula $\beta$ is equivalent to a base formula
$\beta'$ obtained from $\beta$ by eliminating all nodes that are not
determined.

Proof. Construct $\beta'$ from $\beta$ by eliminating all terms containing
a variable $u \in U \setminus$ determined and eliminating the
corresponding existential quantifiers. Then all variables in $\beta'$ are
determined. $\beta'$ has fewer conjuncts than $\beta$, so
$\models \beta \Rightarrow \beta'$. To show
$\models \beta' \Rightarrow \beta$, let $\sigma$ be any assignment of terms
to determined variables of $\beta$ such that $\beta'$ evaluates to true
under $\sigma$. As in the proof of Lemma 25 define the extension $\sigma'$
of $\sigma$ as follows. Choose sufficiently large values $\sigma'(v)$ for
every uncovered sink variable $v$, so that $\sigma''$, defined as the
unique extension of $\sigma'$ to the remaining undetermined variables,
assigns different terms to different variables. This is possible because
the term model is infinite. The resulting assignment $\sigma''$ satisfies
the matrix of the base formula $\beta$. Therefore,
$\models \beta' \Rightarrow \beta$, so $\beta$ and $\beta'$ are equivalent
base formulas. ■



First Construction 

Proposition 34 (Base to Constructor-Selector) Every base formula $\beta$
is equivalent to a quantifier-free formula $\phi$ in constructor-selector
language.

Proof. By Lemma 33 we may assume that all variables in $\beta$ are
determined. To every variable $u$ we assign a term $\tau(u)$. Term
$\tau(u)$ is in constructor-selector language and the variables of
$\tau(u)$ are among the free variables of $\beta$. If $u \in$ covered, we
assign $\tau(u)$ as in the proof of Lemma 30. If $u_1, \ldots, u_k$ are the
successors of a determined node $u$, we put

\[ \tau(u) = f(\tau(u_1), \ldots, \tau(u_k)) \]

where $f$ is the label of node $u$. This definition uniquely determines
$\tau(u)$ for all $u \in$ determined. We obtain the quantifier-free formula
$\phi$ by replacing every variable $u$ with $\tau(u)$ and eliminating all
quantifiers.

For every $u$ we have $\models \beta \Rightarrow u = \tau(u)$, so
$\models \beta \Rightarrow \phi$. Conversely, if $\phi$ is satisfied then
$\tau$ defines an assignment for $u$ variables which makes the matrix of
$\beta$ true. Therefore $\beta$ and $\phi$ are equivalent. ■
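
The assignment of terms to variables in this proof can be sketched as a two-pass traversal. This is an illustration only: the encoding of the graph (`nodes` maps a variable to `(label, successors)` or to `None` for a parameter node), the textual term syntax such as `g_1(x)` for a selector, and the function name `tau_terms` are assumptions introduced here, not the paper's notation.

```python
def tau_terms(nodes, free_bindings):
    """Return a dict mapping every determined node to a term string.

    Covered nodes receive selector terms over a free variable;
    the remaining determined nodes receive constructor terms."""
    tau = {}
    stack = []
    for x, u in free_bindings.items():       # conjunct x = u: tau(u) is x itself
        if u not in tau:
            tau[u] = x
            stack.append(u)
    while stack:                              # covered nodes: selector terms
        u = stack.pop()
        if nodes[u] is None:
            continue
        label, children = nodes[u]
        for i, c in enumerate(children, start=1):
            if c not in tau:
                tau[c] = f"{label}_{i}({tau[u]})"
                stack.append(c)
    changed = True
    while changed:                            # determined but uncovered: constructor terms
        changed = False
        for u, entry in nodes.items():
            if u in tau or entry is None:
                continue
            label, children = entry
            if all(c in tau for c in children):
                args = ", ".join(tau[c] for c in children)
                tau[u] = f"{label}({args})" if children else label
                changed = True
    return tau

# Hypothetical base formula: u1 = g(u2, u3), u3 = a, u6 = g(u3, u2), x = u1.
nodes = {
    "u1": ("g", ("u2", "u3")),
    "u2": None,
    "u3": ("a", ()),
    "u6": ("g", ("u3", "u2")),
}
tau = tau_terms(nodes, {"x": "u1"})
```

Here `u6` is the only node that needs a constructor symbol; every other node is expressed through selectors applied to the free variable `x`.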



Second Construction The reason for using constructor symbols
$f \in \mathrm{Cons}(\Sigma)$ in the first construction is to preserve the
constraints of form $u \neq v$ when eliminating node $u$ with successors
$u_1, \ldots, u_k$. Using constructor symbols we would obtain the
constraint $f(u_1, \ldots, u_k) \neq v$. Our second construction avoids
introducing constructor operations by decomposing
$f(u_1, \ldots, u_k) \neq v$ into disjunction of inequalities of form
$u_i \neq f_i(v)$. When $v$ is a parameter node, the presence of term
$f_i(v)$ potentially requires introducing a new node in the associated
graph; we call this process parameter expansion. Parameter expansion may
increase the total number of nodes in the graph, but it decreases the
number of uncovered nodes, so the process of converting a base formula to
a quantifier-free formula in the selector language terminates.

Lemma 35 Let $\beta$ be an arbitrary base formula.

1. If $u$ is covered and $u \leadsto^* u'$ then $u'$ is covered as well.

2. If $u'$ is uncovered and $u'$ is not a source, then there exists
$u \neq u'$ such that $u \leadsto^* u'$ and $u$ is also uncovered.

3. If $\beta$ contains an uncovered variable then $\beta$ contains an
uncovered variable that is a source.

Proof. By definition. ■



Parameter Expansion We define the operation of expanding a parameter node
in a base formula as follows. Let $\beta$ be an arbitrary base formula and
$w$ a parameter variable in $\beta$. The result of expansion of $w$ is a
disjunction of base formulas $\beta'$ generated by applying (13) to $w$. In
each of the resulting formulas $\beta'$ variable $w$ is not a parameter any
more. Each $\beta'$ contains $\mathrm{Is}_f(w)$ for some $f \in \Sigma$ and
node $w$ has successors $u_1, \ldots, u_k$ for $k = \mathrm{ar}(f)$. Each
successor $u_i$ is either an existing internal variable or a fresh
variable. For a given $\beta$, parameter expansion generates a disjunction
of formulas $\beta'$ for every choice of $f \in \Sigma$ and every choice of
successors $u_i$, subject to congruence closure so that $\beta'$ is a base
formula: we discard the choices of successors of $w$ that yield formulas
$\beta'$ violating congruence of equality. (This process is similar to
converting quantifier-free formulas into disjunction of base formulas in
the proof of Proposition 28.) The following lemma shows the correctness of
parameter expansion.

Lemma 36 (Parameter expansion soundness) Let
$\Delta = \beta'_1 \lor \cdots \lor \beta'_k$ be the disjunction generated
by parameter expansion of a base formula $\beta$. Then $\Delta$ is
equivalent to $\beta$.

Lemma 36 justifies the use of parameter expansion in the following
Lemma 37.

Lemma 37 Every base formula $\beta$ can be written as a disjunction of base
formulas without uncovered variables.

Proof Sketch. By Lemma 33 we may assume that all variables of $\beta$ are
determined. Suppose $\beta$ contains an uncovered variable. Then by
Lemma 35 $\beta$ contains an uncovered variable $u_0$ such that $u_0$ is a
source. Because $u_0$ is uncovered and determined, it is not a parameter
node. We show how to eliminate $u_0$ without introducing new uncovered
variables.

Our goal is to eliminate $u_0$ from the associated graph. We need to
preserve information that $u_0$ is distinct from variables
$u \in U \setminus \{u_0\}$ in the graph. We consider two cases.

If $u$ is not a parameter node, then by congruence closure either $u_0$ and
$u$ are labelled by different function symbols, or they are labelled by the
same function symbol $f \in \Sigma$ with $\mathrm{ar}(f) = k$ and there
exist $i$, $1 \leq i \leq k$, and variables $u_i = f_i(u_0)$ and
$u'_i = f_i(u)$ such that $u_i \neq u'_i$. Hence the constraint
$u_0 \neq u$ is deducible from the inequalities of other variables in
$\beta$ and we can eliminate $u_0$ without changing the truth value of
$\beta$.

Next consider the case when $u$ is a parameter node. By assumption $u$ is
determined, and because it is a parameter, it is covered. We then perform
parameter node expansion as described above. The result of elimination of
$u_0$ in $\beta$ is a disjunction of base formulas $\beta'$; in each
$\beta'$ every parameter node is expanded. If $u$ is a parameter node in
$\beta$ then the constraint $u_0 \neq u$ is preserved in each $\beta'$
because $u$ is not a parameter node in $\beta'$, so the previous argument
applies.

Because the parameter nodes being expanded are covered, so are their
successor nodes introduced by parameter expansion. Therefore, by repeatedly
applying elimination of uncovered variables for every uncovered variable
$u_0$, we obtain a disjunction $\Delta$ of formulas $\beta'$ where each
$\beta'$ has no uncovered variables, and $\Delta$ is equivalent to
$\beta$. ■

Proposition 38 (Base to Selector) For every base formula $\beta$ there
exists an equivalent quantifier-free formula $\psi$ in selector language.

Proof. By Lemma 37, $\beta$ is equivalent to a disjunction
$\beta_1 \lor \cdots \lor \beta_n$ where each $\beta_i$ has no uncovered
variables. By Lemma 30 each $\beta_i$ is equivalent to some quantifier-free
formula $\psi_i$, so $\beta$ is equivalent to the quantifier-free formula
$\psi_1 \lor \cdots \lor \psi_n$. ■

The final theorem in this section summarizes quantifier elim- 
ination for term algebra. 

Theorem 39 (Term Algebra Quantifier Elimination) There exist algorithms
A, B, C such that for a given formula $\phi$ in constructor-selector
language of term algebras:

a) A produces a quantifier-free formula $\phi'$ in constructor-selector
language

b) B produces a quantifier-free formula $\phi'$ in selector language

c) C produces a disjunction $\phi'$ of base formulas

Proof. a): Transform formula $\phi$ into prenex form

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1} Q_n x_n.\ \phi^* \]

where $\phi^*$ is quantifier free, as in Section 3.1. We eliminate the
innermost quantifier $Q_n$ as follows.

Suppose first that $Q_n$ is $\exists$. Transform the matrix $\phi^*$ into
disjunctive normal form $C_1 \lor \cdots \lor C_p$. By Proposition 28
transform $C_1 \lor \cdots \lor C_p$ into a disjunction
$\beta_1 \lor \cdots \lor \beta_m$ of base formulas. Then propagate
$\exists$ into individual disjuncts, using

\[ \exists x_n.\ \beta_1 \lor \cdots \lor \beta_m
   \iff (\exists x_n.\beta_1) \lor \cdots \lor (\exists x_n.\beta_m) \]

By Proposition 27 an existentially quantified base formula is again a base
formula, so $\exists x_n.\beta_i \iff \beta'_i$ for some $\beta'_i$. We
thus obtain the formula

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \beta'_1 \lor \cdots \lor \beta'_m \qquad (18) \]

By Proposition 34 every base formula is equivalent to a quantifier-free
formula in constructor-selector language, so (18) is equivalent to

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \psi \]

where $\psi$ is a quantifier-free formula. Hence, we have eliminated the
innermost existential quantifier.

Next consider the case when $Q_n$ is $\forall$. Then $\phi$ is equivalent
to

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \lnot \exists x_n.\ \lnot \phi^* \]

Apply the procedure for eliminating $\exists x_n$ to $\lnot \phi^*$. The
result is a formula of form

\[ Q_1 x_1 \ldots Q_{n-1} x_{n-1}.\ \lnot \psi \qquad (19) \]

where $\psi$ is quantifier free. But $\lnot \psi$ is also quantifier free,
so we have eliminated the innermost universal quantifier. By repeating this
process we eliminate all quantifiers, yielding the desired formula
$\phi'$.

The direct construction for showing b) is analogous to a), but uses
Proposition 38 in place of Proposition 34. To show c), apply e.g.
construction a) to obtain a quantifier-free formula $\psi$ and then
transform $\psi$ into disjunction of base formulas using
Proposition 28. ■
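
The control structure of this proof (eliminate the innermost quantifier, and treat $\forall$ as $\lnot\exists\lnot$) does not depend on term algebras. The sketch below shows that loop on a deliberately trivial stand-in theory: propositional variables, with a formula represented by its set of satisfying assignments, so that eliminating $\exists x$ is a projection. This is not the term-algebra procedure; the representation and all function names are assumptions chosen only to make the loop runnable.

```python
from itertools import product

def all_assignments(variables):
    """Every truth assignment to `variables`, as a frozenset of (name, value) pairs."""
    return {frozenset(zip(variables, values))
            for values in product([False, True], repeat=len(variables))}

def negate(models, variables):
    """Complement of a set of satisfying assignments over `variables`."""
    return all_assignments(variables) - models

def exists(x, models):
    """Eliminate an existential quantifier by projecting x away."""
    return {frozenset((v, b) for v, b in m if v != x) for m in models}

def eliminate(prefix, models, variables):
    """prefix: list of (quantifier, variable) pairs, outermost first."""
    variables = list(variables)
    for quantifier, x in reversed(prefix):      # innermost quantifier first
        if quantifier == "forall":              # forall x. phi  ==  not exists x. not phi
            models = negate(models, variables)
        models = exists(x, models)
        variables.remove(x)
        if quantifier == "forall":
            models = negate(models, variables)
    return models, variables

# Matrix "x or y" over the variables x, y.
matrix = {m for m in all_assignments(["x", "y"]) if dict(m)["x"] or dict(m)["y"]}

# forall y. (x or y) is equivalent to x.
free_result, free_vars = eliminate([("forall", "y")], matrix, ["x", "y"])

# exists x. forall y. (x or y) is a true sentence: one (empty) satisfying assignment.
closed_result, closed_vars = eliminate([("exists", "x"), ("forall", "y")], matrix, ["x", "y"])
```

In the term-algebra setting the role of `exists` is played by Propositions 28, 27 and 34 (or 38), and the role of `negate` by ordinary negation of a quantifier-free formula.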

This completes our description of quantifier elimination 
for term algebras. 

We remark that there are alternative ways to define base 
formula. In particular the requirement on disequality of all 
variables is not necessary. This requirement may lead to 
unnecessary case analysis when converting a quantifier-free 
formula to disjunction of base formulas, but we believe that 
it simplifies the correctness argument. 

4 The Pair Constructor and Two Constants 

In this section we give a quantifier elimination procedure for structural
subtyping of non-recursive types with two constant symbols and one
covariant binary constructor. Two constants correspond to two primitive
types; one binary covariant constructor corresponds to the pair
constructor for building products of types.

The construction in this section is an introduction to the more general
construction in Section 5, where we give a quantifier elimination procedure
for any number of constant symbols and relations between them. The
construction in this section demonstrates the interaction between the term
and boolean algebra components of the structural subtyping. We therefore
believe the construction captures the essence of the general result of
Section 5.

The basic observation behind the quantifier elimination 
procedure for two constant symbols is that the structure of 
terms in this language is isomorphic to a disjoint union of 
boolean algebras with some additional term structure con- 
necting elements from different boolean algebras. As we ar- 
gue below, the structural subtyping structure contains one 
copy of boolean algebra for every equivalence class of terms 
that have the same "shape" i.e. are same up to the constants 
in the leaves. 



Consider a signature $\Sigma = \{a, b, g\}$ where $a$ and $b$ are constant
symbols and $g$ is a function symbol of arity 2. We define a partial order
$\leq$ on the set $\mathrm{FT}(\Sigma)$ of ground terms over $\Sigma$ as
the least reflexive partial order relation $\rho$ satisfying

1. $a \mathrel{\rho} b$;

2. $(s_1 \mathrel{\rho} t_1) \land (s_2 \mathrel{\rho} t_2) \Rightarrow
g(s_1, s_2) \mathrel{\rho} g(t_1, t_2)$.

The structure with equality in the language $\{a, b, g, \leq\}$, where
$\leq$ is interpreted as above and $a$, $b$, $g$ are interpreted as free
operations on term algebra, corresponds to the structural subtyping with
two base types $a$ and $b$ and one binary type constructor $g$, with $g$
covariant in both arguments. We denote this structure by BS. We proceed to
show that BS admits quantifier elimination and is therefore decidable.

4.1 Boolean Algebras on Equivalent Terms 

In preparation for the quantifier elimination procedure we 
define certain operations and relations on terms. We also 
establish some fundamental properties of the structure BS. 
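Define a new signature $\Sigma_0 = \{c^s, g^s\}$ as an abstraction of
signature $\Sigma = \{a, b, g\}$. Define function
shapified $: \Sigma \to \Sigma_0$ by

\[ \begin{array}{rcl}
\mathrm{shapified}(a) &=& c^s \\
\mathrm{shapified}(b) &=& c^s \\
\mathrm{shapified}(g) &=& g^s
\end{array} \]

Let $\mathrm{ar}(\mathrm{shapified}(f)) = \mathrm{ar}(f)$ for each
$f \in \Sigma$; in this case $c^s$ is a constant and $g^s$ is a binary
function symbol. Let $\mathrm{FT}(\Sigma_0)$ be the set of ground terms
over the signature $\Sigma_0$. Define the shape of a term $t$ as the
function $\mathrm{sh} : \mathrm{FT}(\Sigma) \to \mathrm{FT}(\Sigma_0)$, by
letting

\[ \mathrm{sh}(f(t_1, \ldots, t_k)) =
   \mathrm{shapified}(f)(\mathrm{sh}(t_1), \ldots, \mathrm{sh}(t_k)) \]

for $k = \mathrm{ar}(f)$. In this case we have

\[ \begin{array}{rcl}
\mathrm{sh}(a) &=& c^s \\
\mathrm{sh}(b) &=& c^s \\
\mathrm{sh}(g(t_1, t_2)) &=& g^s(\mathrm{sh}(t_1), \mathrm{sh}(t_2))
\end{array} \]

Define $t_1 \sim t_2$ iff $\mathrm{sh}(t_1) = \mathrm{sh}(t_2)$. Then
$\sim$ is the smallest equivalence relation $\rho$ such that

1. $a \mathrel{\rho} b$;

2. $(s_1 \mathrel{\rho} t_1) \land (s_2 \mathrel{\rho} t_2) \Rightarrow
g(s_1, s_2) \mathrel{\rho} g(t_1, t_2)$.

For every term $t$ define the word $\mathrm{tCont}(t) \in \{0, 1\}^*$ by
letting

\[ \begin{array}{rcl}
\mathrm{tCont}(a) &=& 0 \\
\mathrm{tCont}(b) &=& 1 \\
\mathrm{tCont}(g(t_1, t_2)) &=& \mathrm{tCont}(t_1) \cdot \mathrm{tCont}(t_2)
\end{array} \]

The set of all words $w \in \{0, 1\}^n$ is isomorphic to the boolean
algebra $B_n$ of all subsets of some finite set of cardinality $n$, so we
write $w_1 \cap w_2$, $w_1 \cup w_2$, $w^c$ for operations corresponding to
intersection, union, and set complement in the set of words
$w \in \{0, 1\}^n$. We write $w_1 \subseteq w_2$ for
$w_1 \cap w_2 = w_1$. Define function $\delta$ by

\[ \delta(t) = \langle \mathrm{sh}(t), \mathrm{tCont}(t) \rangle \]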
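
The decomposition of a term into a shape and a bit word is easy to make concrete. The following minimal sketch assumes an encoding that is not in the paper: constants are the strings `"a"` and `"b"`, a term $g(t_1, t_2)$ is the tuple `("g", t1, t2)`, the shape constant $c^s$ is `"c"` and $g^s$ is tagged `"gs"`. The inverse function `undelta` is included only to illustrate that a shape together with a word of matching length determines the term.

```python
def sh(t):
    """Shape of a term: forget which constant sits at each leaf."""
    if isinstance(t, str):
        return "c"
    _, t1, t2 = t
    return ("gs", sh(t1), sh(t2))

def tcont(t):
    """Leaf contents read left to right: a -> 0, b -> 1."""
    if isinstance(t, str):
        return "0" if t == "a" else "1"
    _, t1, t2 = t
    return tcont(t1) + tcont(t2)

def delta(t):
    return (sh(t), tcont(t))

def undelta(s, w):
    """Rebuild the term from a shape s and a word w with one bit per leaf of s."""
    def build(s, i):
        if s == "c":
            return ("a" if w[i] == "0" else "b"), i + 1
        _, s1, s2 = s
        t1, i = build(s1, i)
        t2, i = build(s2, i)
        return ("g", t1, t2), i
    t, used = build(s, 0)
    assert used == len(w)       # the word must have exactly one bit per leaf
    return t

t = ("g", "a", ("g", "b", "a"))     # the term g(a, g(b, a))
shape, word = delta(t)
```

For this term the shape is $g^s(c^s, g^s(c^s, c^s))$ and the word is `010`; applying `undelta` to the pair returns the original term.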
For term $t$ in any language containing constant symbols, let
$\mathrm{tLen}(t)$ denote the number of occurrences of constant symbols in
$t$. If $w$ is a sequence of elements of some set, let $\mathrm{sLen}(w)$
denote the length of the sequence. Observe that
$\mathrm{sLen}(\mathrm{tCont}(t)) = \mathrm{tLen}(t)$ and
$\mathrm{tLen}(\mathrm{sh}(t)) = \mathrm{tLen}(t)$. Moreover,
$t_1 \sim t_2$ implies
$\mathrm{sLen}(\mathrm{tCont}(t_1)) = \mathrm{sLen}(\mathrm{tCont}(t_2))$.
Define the set $B$ by

\[ B = \{ \langle s, w \rangle \mid s \in \mathrm{FT}(\Sigma_0),\
   w \in \{0, 1\}^*,\ \mathrm{tLen}(s) = \mathrm{sLen}(w) \} \]

Function $\delta$ is a bijection from the set $\mathrm{FT}(\Sigma)$ to the
set $B$. For $b_1, b_2 \in B$ define $b_1 \leq b_2$ iff
$\delta^{-1}(b_1) \leq \delta^{-1}(b_2)$. From the definitions it follows

\[ \langle s_1, w_1 \rangle \leq \langle s_2, w_2 \rangle \iff
   s_1 = s_2 \land w_1 \subseteq w_2 \]

If $g$ is defined on $B$ via isomorphism $\delta$ we also have

\[ g(\langle s_1, w_1 \rangle, \langle s_2, w_2 \rangle) =
   \langle g^s(s_1, s_2),\ w_1 \cdot w_2 \rangle \]

For any fixed $s_0 \in \mathrm{FT}(\Sigma_0)$, the set

\[ B(s_0) = \{ \langle s, w \rangle \in B \mid s = s_0 \} \qquad (20) \]

is isomorphic to the boolean algebra $B_n$, where
$n = \mathrm{tLen}(s_0)$. Accordingly, we introduce on each $B(s)$ the set
operations $t_1 \cap_s t_2$, $t_1 \cup_s t_2$, $t_1^{c_s}$. Expressions
$t_1 \cap_s t_2$ and $t_1 \cup_s t_2$ are defined iff
$\mathrm{sh}(t_1) = s$ and $\mathrm{sh}(t_2) = s$, whereas expression
$t_1^{c_s}$ is defined iff $\mathrm{sh}(t_1) = s$.

We also introduce cardinality expressions as in Section 3.2. If $t$
denotes a term, then the expression $|t|_s$ denotes the number of elements
of the set corresponding to $t$. Here we require $s = \mathrm{sh}(t)$. We
use expressions $|t|_s = k$ and $|t|_s \geq k$ as atomic formulas for
constant integer $k \geq 0$. Note that

\[ t_1 \leq t_2 \iff \mathrm{sh}(t_1) = \mathrm{sh}(t_2) \land
   |t_1 \cap t_2^c|_{\mathrm{sh}(t_1)} = 0 \qquad (21) \]

\[ t_1 = t_2 \iff \mathrm{sh}(t_1) = \mathrm{sh}(t_2) \land
   |(t_1 \cap t_2^c) \cup (t_1^c \cap t_2)|_{\mathrm{sh}(t_1)} = 0 \qquad (22) \]

Let $\mathrm{sh}(t_1) = s_1$, $\mathrm{sh}(t_2) = s_2$, and
$s = g^s(s_1, s_2)$. Then

\[ |g(t_1, t_2)|_s = |t_1|_{s_1} + |t_2|_{s_2} \qquad (23) \]

Equation (23) allows decomposing formulas of form
$|g(t_1, t_2)|_s \geq k$ into propositional combinations of formulas of
form $|t_1|_{s_1} \geq k$ and $|t_2|_{s_2} \geq k$.

Note further that the following equations hold:

\[ \begin{array}{rcl}
g(t_1, t_2) \cap g(t'_1, t'_2) &=& g(t_1 \cap t'_1,\ t_2 \cap t'_2) \\
g(t_1, t_2) \cup g(t'_1, t'_2) &=& g(t_1 \cup t'_1,\ t_2 \cup t'_2) \\
g(t_1, t_2)^c &=& g(t_1^c,\ t_2^c)
\end{array} \]

If $E(x_1, \ldots, x_n)$ denotes an expression consisting only of
operations of boolean algebra, then from these equations it follows by
induction that

\[ E(g(t_1^1, t_2^1), \ldots, g(t_1^n, t_2^n)) =
   g(E(t_1^1, \ldots, t_1^n),\ E(t_2^1, \ldots, t_2^n)) \qquad (24) \]

Equations (24) and (23) imply

\[ |E(g(t_1^1, t_2^1), \ldots, g(t_1^n, t_2^n))| =
   |E(t_1^1, \ldots, t_1^n)| + |E(t_2^1, \ldots, t_2^n)| \qquad (25) \]

Boolean algebra $B(g^s(s_1, s_2))$ is isomorphic to the product of boolean
algebras $B(s_1)$ and $B(s_2)$; the constructor $g$ acts as union of
disjoint sets.
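
Equivalences (21) and (22) and the additivity laws (23) to (25) can be checked exhaustively on small terms. The sketch below is a sanity check, not a proof. It assumes the same hypothetical encoding as before (constants `"a"`, `"b"`; `("g", t1, t2)` for $g(t_1, t_2)$), defines the order directly from its two generating rules, and uses the symmetric difference as one sample expression $E$.

```python
from itertools import product

def terms(depth):
    """All ground terms over {a, b, g} of depth at most `depth`."""
    if depth == 0:
        return ["a", "b"]
    smaller = terms(depth - 1)
    return ["a", "b"] + [("g", s, t) for s, t in product(smaller, repeat=2)]

def sh(t):
    return "c" if isinstance(t, str) else ("gs", sh(t[1]), sh(t[2]))

def tcont(t):
    if isinstance(t, str):
        return "0" if t == "a" else "1"
    return tcont(t[1]) + tcont(t[2])

def leq(s, t):
    """The subtype order, defined from the rules a <= b and covariance of g."""
    if isinstance(s, str) and isinstance(t, str):
        return s == t or (s, t) == ("a", "b")
    if isinstance(s, tuple) and isinstance(t, tuple):
        return leq(s[1], t[1]) and leq(s[2], t[2])
    return False

def card(w):
    return w.count("1")

def meet(u, v):
    return "".join("1" if (x, y) == ("1", "1") else "0" for x, y in zip(u, v))

def join(u, v):
    return "".join("1" if "1" in (x, y) else "0" for x, y in zip(u, v))

def comp(u):
    return "".join("0" if x == "1" else "1" for x in u)

def sym_diff(u, v):
    """Sample expression E(x, y) = (x ∩ y^c) ∪ (x^c ∩ y)."""
    return join(meet(u, comp(v)), meet(comp(u), v))

def check(depth=2):
    for t1, t2 in product(terms(depth), repeat=2):
        same = sh(t1) == sh(t2)
        w1, w2 = tcont(t1), tcont(t2)
        assert leq(t1, t2) == (same and card(meet(w1, comp(w2))) == 0)   # (21)
        assert (t1 == t2) == (same and card(sym_diff(w1, w2)) == 0)      # (22)
        assert card(tcont(("g", t1, t2))) == card(w1) + card(w2)         # (23)
    words = ["".join(p) for n in (1, 2) for p in product("01", repeat=n)]
    for x1, y1, x2, y2 in product(words, repeat=4):
        if len(x1) == len(y1) and len(x2) == len(y2):
            lhs = sym_diff(x1 + x2, y1 + y2)          # E(g(x1, x2), g(y1, y2)) on contents
            assert lhs == sym_diff(x1, y1) + sym_diff(x2, y2)                    # (24)
            assert card(lhs) == card(sym_diff(x1, y1)) + card(sym_diff(x2, y2))  # (25)
    return True

all_hold = check()
```

On word contents the constructor $g$ is plain concatenation, which is why (24) and (25) reduce to statements about bitwise operations on concatenated words.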



Figure 5: Operations and relations in structure FT2 



4.2 A Multisorted Logic 

To show the decidability of structure BS, we give a quantifier elimination
procedure for an extended structure, denoted FT2. We use a first-order
two-sorted logic with sorts term and shape interpreted over FT2.

The domain of structure FT2 is
$\mathrm{FT}(\Sigma) \cup \mathrm{FT}(\Sigma_0)$ with elements of
$\mathrm{FT}(\Sigma)$ having sort term and elements of
$\mathrm{FT}(\Sigma_0)$ having sort shape. Variables in Var have term sort,
variables in $\mathrm{Var}^s$ have shape sort. In general, if $t$ denotes
an element of FT2, we write $t^s$ to indicate that the element has sort
shape.

Figure 5 shows operations and relations in FT2 with their sort
declarations. The signature is infinite because operations
$|t|_s \geq k$ and $|t|_s = k$ are parameterized by a non-negative integer
$k$.

We require all terms to be well-sorted. Functions $g_1$ and $g_2$ are
interpreted as partial selector functions in the term constructor-selector
language, so
$D_{g_1} = D_{g_2} = \langle \langle x \rangle, \mathrm{Is}_g(x) \rangle$.
Similarly, $g^s_1$ and $g^s_2$ are partial selector functions in the shape
constructor-selector language, so
$D_{g^s_1} = D_{g^s_2} = \langle \langle x^s \rangle, \mathrm{Is}_{g^s}(x^s) \rangle$.
The expressions $t_1 \cap_s t_2$ and $t_1 \cup_s t_2$ are defined iff
$\mathrm{sh}(t_1) = \mathrm{sh}(t_2) = s$, and $t^{c_s}$ is defined iff
$\mathrm{sh}(t) = s$. We therefore let

\[ D_{\cap} = D_{\cup} = \langle \langle y^s, x_1, x_2 \rangle,\
   \mathrm{sh}(x_1) = y^s \land \mathrm{sh}(x_2) = y^s \rangle \]

and

\[ D_{{}^c} = \langle \langle y^s, x \rangle,\ \mathrm{sh}(x) = y^s \rangle \]

For atomic formulas $|t|_s \geq k$ and $|t|_s = k$ we require atomic
formula $\mathrm{sh}(t) = s$ to ensure well-definedness:

\[ D_{|\_|\_ = k} = D_{|\_|\_ \geq k} =
   \langle \langle y^s, x \rangle,\ \mathrm{sh}(x) = y^s \rangle \]

Note that the language of Figure 5 subsumes the language
$\{a, b, g, \leq\}$ for the structural subtyping structure. The
quantifier-elimination procedure we present in Section 4.3 is therefore
sufficient for quantifier elimination in the first-order logic interpreted
over the structural subtyping structure BS.

4.3 Quantifier Elimination for Two Constants 

We are now ready to present a quantifier elimination procedure for the
structure FT2. The quantifier elimination procedure is based on the
quantifier elimination for term algebras of Section 3.4 as well as the
quantifier elimination for boolean algebras of Section 3.2.

We first define an auxiliary notion of a $u^s$-term as a term formed
starting from shape $u^s$ term variables and shape $u^s$ constants, using
operations $\cap_{u^s}$, $\cup_{u^s}$, and ${}^{c_{u^s}}$.

Definition 40 ($u^s$-terms) Let $u^s \in \mathrm{Var}^s$ be a shape
variable. The set of $u^s$-terms $\mathrm{Term}(u^s)$ is the least set such
that:

1. $\mathrm{Var} \subseteq \mathrm{Term}(u^s)$

2. $0_{u^s}, 1_{u^s} \in \mathrm{Term}(u^s)$

3. if $t, t' \in \mathrm{Term}(u^s)$, then also
$t \cap_{u^s} t' \in \mathrm{Term}(u^s)$,
$t \cup_{u^s} t' \in \mathrm{Term}(u^s)$, and
$t^{c_{u^s}} \in \mathrm{Term}(u^s)$



Similarly to base formulas of Section 3.4 we define structural base
formulas for FT2 structure. A structural base formula contains a copy of a
base formula for the shape sort (shapeBase), a base formula for the term
sort without term disequalities (termBase), a formula expressing mapping of
term variables to shape variables (hom), and cardinality constraints on
term parameter nodes of the term base formula (cardin).

Definition 41 (Structural Base Formula)

A structural base formula with:

• free term variables $x_1, \ldots, x_m$;

• internal non-parameter term variables $u_1, \ldots, u_p$;

• internal parameter term variables $u_{p+1}, \ldots, u_{p+q}$;

• free shape variables $x^s_1, \ldots, x^s_{m^s}$;

• internal non-parameter shape variables $u^s_1, \ldots, u^s_{p^s}$;

• internal parameter shape variables
$u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}$

is a formula of form:

\[ \begin{array}{l}
\exists u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}. \\
\quad \mathrm{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) \ \land \\
\quad \mathrm{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) \ \land \\
\quad \mathrm{hom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) \ \land \\
\quad \mathrm{cardin}(u_{p+1}, \ldots, u_n, u^s_{p^s+1}, \ldots, u^s_{n^s})
\end{array} \]

where $n = p + q$, $n^s = p^s + q^s$, and formulas shapeBase, termBase,
hom, and cardin are defined as follows.

\[ \mathrm{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) =
   \bigwedge_{i=1}^{p^s} u^s_i = t_i(u^s_1, \ldots, u^s_{n^s}) \ \land\
   \bigwedge_{i=1}^{m^s} x^s_i = u^s_{j_i} \ \land\
   \mathrm{distinct}(u^s_1, \ldots, u^s_{n^s}) \]

where each $t_i$ is a shape term of form $f(u^s_{i_1}, \ldots, u^s_{i_k})$
for some $f \in \Sigma_0$, $k = \mathrm{ar}(f)$, and
$j : \{1, \ldots, m^s\} \to \{1, \ldots, n^s\}$ is a function mapping
indices of free shape variables to indices of internal shape variables.

\[ \mathrm{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) =
   \bigwedge_{i=1}^{p} u_i = t_i(u_1, \ldots, u_n) \ \land\
   \bigwedge_{i=1}^{m} x_i = u_{j_i} \]

where each $t_i$ is a term of form $f(u_{i_1}, \ldots, u_{i_k})$ for some
$f \in \Sigma$, $k = \mathrm{ar}(f)$, and
$j : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is a function mapping indices
of free term variables to indices of internal term variables.

\[ \mathrm{hom}(u_1, \ldots, u_n, u^s_1, \ldots, u^s_{n^s}) =
   \bigwedge_{i=1}^{n} \mathrm{sh}(u_i) = u^s_{j_i} \]

where $j : \{1, \ldots, n\} \to \{1, \ldots, n^s\}$ is some function such
that $\{j_1, \ldots, j_p\} \subseteq \{1, \ldots, p^s\}$ and
$\{j_{p+1}, \ldots, j_{p+q}\} \subseteq \{p^s + 1, \ldots, p^s + q^s\}$ (a
term variable is a parameter variable iff its shape is a parameter shape
variable).

\[ \mathrm{cardin}(u_{p+1}, \ldots, u_{p+q}, u^s_{p^s+1}, \ldots, u^s_{p^s+q^s}) =
   \psi_1 \land \cdots \land \psi_r \]

where each $\psi_i$ is of form

\[ |t(u_{p+1}, \ldots, u_{p+q})|_{u^s} = k \]

or

\[ |t(u_{p+1}, \ldots, u_{p+q})|_{u^s} \geq k \]

for some $u^s$-term $t(u_{p+1}, \ldots, u_{p+q})$ that contains no
variables other than some of the variables $u_{p+1}, \ldots, u_{p+q}$, and
the following condition holds:

If a variable $u_{p+j}$ occurs in term $t(u_{p+1}, \ldots, u_{p+q})$, then
$\mathrm{sh}(u_{p+j}) = u^s$ occurs in formula hom. (26)

We require each structural base formula to satisfy the following
conditions:

P0) the graph associated with shape base formula

\[ \exists u^s_1, \ldots, u^s_{n^s}.\
   \mathrm{shapeBase}(u^s_1, \ldots, u^s_{n^s}, x^s_1, \ldots, x^s_{m^s}) \]

is acyclic (compare to Definition 21);

P1) congruence closure property for shapeBase subformula: there are no two
distinct variables $u^s_i$ and $u^s_j$ such that both
$u^s_i = f(u^s_{l_1}, \ldots, u^s_{l_k})$ and
$u^s_j = f(u^s_{l_1}, \ldots, u^s_{l_k})$ occur as conjuncts in formula
shapeBase;

P2) congruence closure property for termBase subformula: there are no two
distinct variables $u_i$ and $u_j$ such that both
$u_i = f(u_{l_1}, \ldots, u_{l_k})$ and
$u_j = f(u_{l_1}, \ldots, u_{l_k})$ occur as conjuncts in formula termBase;

P3) homomorphism property of sh: for every non-parameter term variable $u$
such that $u = f(u_{i_1}, \ldots, u_{i_k})$ occurs in termBase, if the
conjunct $\mathrm{sh}(u) = u^s$ occurs in hom, then for some shape
variables $u^s_{j_1}, \ldots, u^s_{j_k}$ the term
$u^s = f^s(u^s_{j_1}, \ldots, u^s_{j_k})$ occurs in shapeBase where
$f^s = \mathrm{shapified}(f)$ and for every $r$ where $1 \leq r \leq k$,
conjunct $\mathrm{sh}(u_{i_r}) = u^s_{j_r}$ occurs in hom.
According to Definition 41 a structural base formula contains no selector
function symbols. Formulation using selector symbols is also possible, as
in Definition 20. The only partial function symbols occurring in a
structural base formula of Definition 41 are in cardin subformula.
Condition (26) therefore ensures that functions in cardin and thus the
entire base formula are well-defined.

Note that acyclicity of shape base formula shapeBase (condition P0)
implies acyclicity of term base formula as well. Namely, condition P3
ensures that any cycle in termBase implies a cycle in shapeBase.

As in Section 3.4 we proceed to show that each quantifier-free formula can
be written as a disjunction of base formulas and each base formula can be
written as a quantifier-free formula.

We strongly encourage the reader to study the following example because it
illustrates the idea behind our quantifier-elimination decision procedure.

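Example 42 The following sentence is true in structure FT2.

\[ \begin{array}{l}
\forall x, y.\ x \leq y \Rightarrow \\
\quad \exists z.\ z \leq x \land z \leq y \ \land \\
\qquad \forall w.\ w \leq x \land w \leq y \Rightarrow \\
\qquad\quad \forall v.\ g(v, z) \leq g(z, v) \land \mathrm{Is}_g(v) \land
   \mathrm{Is}_g(w) \Rightarrow g_1(w) \leq g_1(v)
\end{array} \qquad (27) \]

An informal proof of sentence (27) is as follows. Suppose that
$x \leq y$. Then $\mathrm{sh}(x) = \mathrm{sh}(y) = x^s$. Let
$z = x \cap_{x^s} y$. Now consider some $w$ such that $w \leq x$ and
$w \leq y$. Then $\mathrm{sh}(w) = x^s$, so $w \leq z$. Suppose that $v$ is
such that $g(v, z) \leq g(z, v)$. Then by covariance of $g$ we have
$z \leq v$, so $w \leq v$. If we assume $\mathrm{Is}_g(w)$ and
$\mathrm{Is}_g(v)$, then $g_1(w)$ and $g_1(v)$ are well defined and by
covariance of $g$ we conclude $g_1(w) \leq g_1(v)$, as desired.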
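
Sentence (27) can also be sanity-checked by brute force on a finite fragment of the term model. The sketch below restricts every quantifier to terms of depth at most 2, so it is a bounded check rather than a decision procedure. The term encoding (`"a"`, `"b"`, `("g", t1, t2)`) and all function names are assumptions made for this illustration.

```python
from itertools import product

def terms(depth):
    """All ground terms over {a, b, g} of depth at most `depth`."""
    if depth == 0:
        return ["a", "b"]
    smaller = terms(depth - 1)
    return ["a", "b"] + [("g", s, t) for s, t in product(smaller, repeat=2)]

def leq(s, t):
    """Subtype order generated by a <= b and covariance of g."""
    if isinstance(s, str) and isinstance(t, str):
        return s == t or (s, t) == ("a", "b")
    if isinstance(s, tuple) and isinstance(t, tuple):
        return leq(s[1], t[1]) and leq(s[2], t[2])
    return False

def is_g(t):
    return isinstance(t, tuple)

def g(s, t):
    return ("g", s, t)

def g1(t):
    return t[1]          # partial selector: only call when is_g(t) holds

def body_v(v, z, w):
    # g(v, z) <= g(z, v) and Is_g(v) and Is_g(w)  implies  g1(w) <= g1(v)
    premise = leq(g(v, z), g(z, v)) and is_g(v) and is_g(w)
    return (not premise) or leq(g1(w), g1(v))

def body_w(w, x, y, z, universe):
    return (not (leq(w, x) and leq(w, y))) or all(body_v(v, z, w) for v in universe)

def body_z(z, x, y, universe):
    return leq(z, x) and leq(z, y) and all(body_w(w, x, y, z, universe) for w in universe)

def check_27(universe):
    return all((not leq(x, y)) or any(body_z(z, x, y, universe) for z in universe)
               for x in universe for y in universe)

holds_on_fragment = check_27(terms(2))
```

The implication in `body_v` is written so that `g1` is evaluated only after both `Is_g` tests succeed, mirroring the well-definedness requirement on partial selector functions.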

We now give an alternative argument that shows that sentence (27) is true.
This alternative argument illustrates the idea behind our
quantifier-elimination decision procedure. For the sake of brevity we
perform some additional simplifications along the way that are not part of
the procedure we present (although they could be incorporated to improve
efficiency), and we skip consideration of some uninteresting cases during
the case analyses.

Let us first eliminate the quantifier from formula

\[ \forall v.\ g(v, z) \leq g(z, v) \land \mathrm{Is}_g(v) \land
   \mathrm{Is}_g(w) \Rightarrow g_1(w) \leq g_1(v) \qquad (28) \]

Formula (28) is equivalent to $\lnot \exists v.\ \phi_1$ where

\[ \phi_1 = g(v, z) \leq g(z, v) \land \mathrm{Is}_g(v) \land
   \mathrm{Is}_g(w) \land \lnot (g_1(w) \leq g_1(v)) \qquad (29) \]

We next use (21) to eliminate atomic formulas $t_1 \leq t_2$ and replace
them with cardinality constraints, resulting in formula $\phi_2$
equivalent to $\phi_1$:

\[ \phi_2 = \phi_{2,1} \land \phi_{2,2} \]

where

\[ \begin{array}{rcl}
\phi_{2,1} &=& |g(v, z) \cap g(z, v)^c|_{\mathrm{sh}(g(v,z))} = 0 \ \land \\
 & & \mathrm{sh}(g(v, z)) = \mathrm{sh}(g(z, v)) \ \land \\
 & & \mathrm{Is}_g(v) \land \mathrm{Is}_g(w)
\end{array} \qquad (30) \]

and

\[ \phi_{2,2} = \lnot \big( |g_1(w) \cap g_1(v)^c|_{\mathrm{sh}(g_1(w))} = 0
   \ \land\ \mathrm{sh}(g_1(w)) = \mathrm{sh}(g_1(v)) \big) \qquad (31) \]

Figure 6: One of the Base Formulas Resulting from (28)

Here we have written e.g.

\[ |g(v, z) \cap g(z, v)^c|_{\mathrm{sh}(g(v,z))} = 0 \]

as a shorthand for

\[ |g(v, z) \cap_{\mathrm{sh}(g(v,z))} g(z, v)^{c_{\mathrm{sh}(g(v,z))}}|_{\mathrm{sh}(g(v,z))} = 0 \]

(In general, we omit term shape arguments for boolean algebra operations
if the arguments are identical to the enclosing term shape argument of the
cardinality constraint.)

We next transform $\phi_2$ into disjunction of well-defined conjunctions.
Following the ideas in Proposition 5 we transform $\phi_{2,2}$ into
$\phi_{3,1} \lor \phi_{3,2}$ where

\[ \phi_{3,1} = |g_1(w) \cap g_1(v)^c|_{\mathrm{sh}(g_1(w))} \geq 1 \ \land\
   \mathrm{sh}(g_1(w)) = \mathrm{sh}(g_1(v)) \qquad (32) \]

and

\[ \phi_{3,2} = \mathrm{sh}(g_1(w)) \neq \mathrm{sh}(g_1(v)) \]

and then transform $\phi_{2,1} \land \phi_{2,2}$ into

\[ (\phi_{2,1} \land \phi_{3,1}) \lor (\phi_{2,1} \land \phi_{3,2}) \]

For the sake of brevity we ignore the case
$\phi_{2,1} \land \phi_{3,2}$; it is possible to show that
$\phi_{2,1} \land \phi_{3,2}$ is equivalent to false in the context of the
entire formula.

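We transform $\phi_{2,1} \land \phi_{3,1}$ into unnested form, introducing
fresh existentially quantified variables $u_{vz}$, $u_{zv}$, $u_{w1}$,
$u_{v1}$, $u^s_{vz}$, $u^s_{w1}$ that denote terms occurring in
$\phi_{2,1} \land \phi_{3,1}$. The result is formula $\phi_4$ where

\[ \begin{array}{rcl}
\phi_4 &=& \exists u_{vz}, u_{zv}, u_{w1}, u_{v1}, u^s_{vz}, u^s_{w1}. \\
 & & u_{vz} = g(v, z) \land u_{zv} = g(z, v) \ \land \\
 & & u_{w1} = g_1(w) \land u_{v1} = g_1(v) \ \land \\
 & & u^s_{vz} = \mathrm{sh}(u_{vz}) \land u^s_{w1} = \mathrm{sh}(u_{w1}) \ \land \\
 & & \mathrm{sh}(u_{zv}) = u^s_{vz} \land \mathrm{sh}(u_{v1}) = u^s_{w1} \ \land \\
 & & \mathrm{Is}_g(v) \land \mathrm{Is}_g(w) \ \land \\
 & & |u_{vz} \cap u_{zv}^c|_{u^s_{vz}} = 0 \ \land\
     |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array} \qquad (33) \]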
To transform $\phi_4$ into disjunction of structural base formulas we keep
introducing new existentially quantified variables and adding derived
conjuncts to satisfy the invariants of Definition 41.

Because $\mathrm{Is}_g(v)$ and $\mathrm{Is}_g(w)$ appear in the conjunct,
we give names to the remaining successors of $v$, $w$, by introducing
$u_{w2} = g_2(w)$, $u_{v2} = g_2(v)$. We may now write the constraints in
constructor language, using e.g. conjunct $v = g(u_{v1}, u_{v2})$ instead
of

\[ \mathrm{Is}_g(v) \land u_{v1} = g_1(v) \land u_{v2} = g_2(v) \]

To ensure that every term variable has an associated shape variable, we
introduce fresh variables $u^s_v$, $u^s_w$, $u^s_z$, $u^s_{w2}$,
$u^s_{v2}$ with conjuncts $u^s_v = \mathrm{sh}(v)$,
$u^s_w = \mathrm{sh}(w)$, $u^s_z = \mathrm{sh}(z)$,
$u^s_{w2} = \mathrm{sh}(u_{w2})$, $u^s_{v2} = \mathrm{sh}(u_{v2})$.

Note that base formula contains a
$\mathrm{distinct}(u^s_1, \ldots, u^s_{n^s})$ subformula. In the case when
the current conjunction is not strong enough to entail the disequality
between shape variables $u^s_i$ and $u^s_j$, we perform case analysis,
considering the case $u^s_i = u^s_j$ (then $u^s_i$ can be replaced by
$u^s_j$), and the case $u^s_i \neq u^s_j$. This case analysis will lead to
a disjunction of structural base formulas (unless some of the formulas is
shown contradictory in the transformation process). In contrast to shape
variables, we do not perform case analysis for disequality of term
variables, because termBase in Definition 41 does not contain a distinct
subformula.

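In this example we perform case analysis on whether $u^s_w = u^s_z$ and
$u^s_v = u^s_z$ should hold. For the sake of the example let us consider
the case when $u^s_v = u^s_w = u^s_z$, $u^s_{v2} = u^s_{w2}$ and
$u^s_{vz}, u^s_w, u^s_{w1}, u^s_{w2}$ are all distinct. In that case shape
variables $u^s_v$, $u^s_w$, $u^s_z$ denote the same shape, so let us
replace e.g. $u^s_v$ and $u^s_z$ with $u^s_w$. Similarly, we replace
$u^s_{v2}$ with $u^s_{w2}$. We obtain conjuncts
$\mathrm{sh}(v) = u^s_w$, $\mathrm{sh}(z) = u^s_w$,
$\mathrm{sh}(u_{v2}) = u^s_{w2}$.

We next ensure homomorphism property P3 in Definition 41. From conjuncts
$u_{vz} = g(v, z)$, $\mathrm{sh}(u_{vz}) = u^s_{vz}$, and
$\mathrm{sh}(v) = \mathrm{sh}(z) = u^s_w$, we conclude

\[ u^s_{vz} = \mathrm{sh}(u_{vz}) = \mathrm{sh}(g(v, z)) =
   g^s(\mathrm{sh}(v), \mathrm{sh}(z)) = g^s(u^s_w, u^s_w) \]

so we add the conjunct $u^s_{vz} = g^s(u^s_w, u^s_w)$ to the formula.
Similarly, from $w = g(u_{w1}, u_{w2})$, $\mathrm{sh}(w) = u^s_w$,
$\mathrm{sh}(u_{w1}) = u^s_{w1}$, $\mathrm{sh}(u_{w2}) = u^s_{w2}$ we
conclude $u^s_w = g^s(u^s_{w1}, u^s_{w2})$ and add this conjunct to the
formula. Adding these two conjuncts makes property P3 hold. (Note that,
had we decided to consider the case where
$\mathrm{sh}(v) \neq \mathrm{sh}(z)$ we would have arrived at a
contradiction due to $\mathrm{sh}(u_{vz}) = \mathrm{sh}(u_{zv})$.)

We next apply rule (25) to reduce all cardinality constraints into
cardinality constraints on parameter nodes (nodes $u$ for which there is no
conjunct of form $u = f(u_{i_1}, \ldots, u_{i_k})$). We replace
$|u_{vz} \cap u_{zv}^c|_{u^s_{vz}} = 0$ with

\[ |u_v \cap u_z^c|_{u^s_w} = 0 \ \land\ |u_z \cap u_v^c|_{u^s_w} = 0 \qquad (34) \]

The node for $v$ already has successors, since $v = g(u_{v1}, u_{v2})$,
but the node for $z$ has none, which prevents application of (25). We
therefore introduce $u_{z1}$ and $u_{z2}$ such that
$z = g(u_{z1}, u_{z2})$. Because $\mathrm{sh}(z) = u^s_w$, we have
$\mathrm{sh}(u_{z1}) = u^s_{w1}$ and $\mathrm{sh}(u_{z2}) = u^s_{w2}$ by
homomorphism property. We can now continue applying rule (25) to (34). The
result is:

\[ \begin{array}{l}
|u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \ \land\ |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \ \land \\
|u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \ \land\ |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0
\end{array} \]

To make the formula conform to Definition 41 we introduce internal
variables $u_v$, $u_z$, $u_w$ corresponding to free variables $v$, $z$,
$w$, respectively. The resulting structural base formula is

\[ \begin{array}{l}
\exists u_{vz}, u_{zv}, u_v, u_z, u_w, u_{v1}, u_{v2}, u_{z1}, u_{z2},
  u_{w1}, u_{w2}, u^s_{vz}, u^s_w, u^s_{w1}, u^s_{w2}. \\
\quad \mathrm{shapeBase}_1 \land \mathrm{termBase}_1 \land
  \mathrm{hom}_1 \land \mathrm{cardin}_1
\end{array} \qquad (35) \]

where

\[ \begin{array}{rcl}
\mathrm{shapeBase}_1 &=& u^s_{vz} = g^s(u^s_w, u^s_w) \land
   u^s_w = g^s(u^s_{w1}, u^s_{w2}) \ \land \\
 & & \mathrm{distinct}(u^s_{vz}, u^s_w, u^s_{w1}, u^s_{w2}) \\[4pt]
\mathrm{termBase}_1 &=& u_{vz} = g(u_v, u_z) \land u_{zv} = g(u_z, u_v) \ \land \\
 & & u_v = g(u_{v1}, u_{v2}) \land u_z = g(u_{z1}, u_{z2}) \ \land \\
 & & u_w = g(u_{w1}, u_{w2}) \ \land \\
 & & v = u_v \land z = u_z \land w = u_w \\[4pt]
\mathrm{hom}_1 &=& \mathrm{sh}(u_{vz}) = u^s_{vz} \land \mathrm{sh}(u_{zv}) = u^s_{vz} \ \land \\
 & & \mathrm{sh}(u_v) = u^s_w \land \mathrm{sh}(u_z) = u^s_w \land \mathrm{sh}(u_w) = u^s_w \ \land \\
 & & \mathrm{sh}(u_{v1}) = u^s_{w1} \land \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \ \land \\
 & & \mathrm{sh}(u_{v2}) = u^s_{w2} \land \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \\[4pt]
\mathrm{cardin}_1 &=& |u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \land |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \ \land \\
 & & |u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \land |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0 \ \land \\
 & & |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array} \]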

Figure 6 shows a graph representation of the subformulas
$\mathrm{shapeBase}_1$, $\mathrm{termBase}_1$, and $\mathrm{hom}_1$ of the
resulting structural base formula.

Recall that we are eliminating the quantification over $v$ from
$\exists v.\ \phi_1$. We can now existentially quantify over $v$. As in
Proposition 27 we simply remove the conjunct $v = u_v$ from termBase and
the quantifier $\exists v$.

As in Figure 4 of Section 3.4, the structural base formula form allows us
to eliminate an existential quantifier, whereas the quantifier-free form
allows us to eliminate a negation. We transform the structural base
formula (35) into a quantifier-free formula as follows.
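We first use rule (7) to eliminate variable $u_{vz}$, replacing it with
$g(u_v, u_z)$. In the resulting formula $g(u_v, u_z)$ occurs only in
$\mathrm{hom}_1$ in the form

\[ \mathrm{sh}(g(u_v, u_z)) = u^s_{vz} \qquad (36) \]

But (36) is a consequence of conjuncts $u^s_{vz} = g^s(u^s_w, u^s_w)$,
$\mathrm{sh}(u_v) = u^s_w$ and $\mathrm{sh}(u_z) = u^s_w$, so we omit (36)
from the formula. In analogous way we eliminate variable $u_{zv}$ and the
conjuncts that contain it. We also eliminate $u_v$, analogously to
$u_{vz}$ and $u_{zv}$. In the resulting formula $u^s_{vz}$ occurs only in
distinct subformula of shapeBase. Conjuncts $u^s_{vz} \neq u^s_w$,
$u^s_{vz} \neq u^s_{w1}$ and $u^s_{vz} \neq u^s_{w2}$ follow from the
remaining conjuncts in shapeBase by acyclicity. Hence we may replace
$\mathrm{distinct}(u^s_{vz}, u^s_w, u^s_{w1}, u^s_{w2})$ by
$\mathrm{distinct}(u^s_w, u^s_{w1}, u^s_{w2})$. Now $u^s_{vz}$ does not
occur in the matrix of the formula, so we may eliminate
$\exists u^s_{vz}$ altogether. The resulting formula is:

\[ \begin{array}{l}
\exists u_z, u_w, u_{v1}, u_{v2}, u_{z1}, u_{z2}, u_{w1}, u_{w2},
  u^s_w, u^s_{w1}, u^s_{w2}. \\
\quad u^s_w = g^s(u^s_{w1}, u^s_{w2}) \land
  \mathrm{distinct}(u^s_w, u^s_{w1}, u^s_{w2}) \ \land \\
\quad u_z = g(u_{z1}, u_{z2}) \land u_w = g(u_{w1}, u_{w2}) \ \land \\
\quad z = u_z \land w = u_w \ \land \\
\quad \mathrm{sh}(u_z) = u^s_w \land \mathrm{sh}(u_w) = u^s_w \ \land \\
\quad \mathrm{sh}(u_{v1}) = u^s_{w1} \land \mathrm{sh}(u_{z1}) = u^s_{w1} \land
  \mathrm{sh}(u_{w1}) = u^s_{w1} \ \land \\
\quad \mathrm{sh}(u_{v2}) = u^s_{w2} \land \mathrm{sh}(u_{z2}) = u^s_{w2} \land
  \mathrm{sh}(u_{w2}) = u^s_{w2} \ \land \\
\quad |u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \land
  |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \ \land \\
\quad |u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \land
  |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0 \ \land \\
\quad |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array} \qquad (37) \]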
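We next eliminate $u_{v1}$. It suffices to eliminate it from conjuncts where it occurs, so we consider formula $\phi_{5,1}$:

\[
\begin{array}{l}
\phi_{5,1} = \exists u_{v1}. \\
\quad \mathrm{sh}(u_{v1}) = u^s_{w1} \land \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \land \\
\quad |u_{v1} \cap u_{z1}^c|_{u^s_{w1}} = 0 \land |u_{z1} \cap u_{v1}^c|_{u^s_{w1}} = 0 \land \\
\quad |u_{w1} \cap u_{v1}^c|_{u^s_{w1}} \geq 1
\end{array}
\tag{38}
\]

Note that all variables from $\phi_{5,1}$ belong to $B(s)$ where $s$ is the value of shape variable $u^s_{w1}$ (see (20)). This means that we can apply quantifier elimination for boolean algebra (Section 3.2) to eliminate $u_{v1}$. The result is

\[
\phi_{5,2} = \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \land |u_{w1} \cap u_{z1}^c|_{u^s_{w1}} \geq 1
\tag{39}
\]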
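The boolean algebra step above can be checked by brute force on a small finite algebra. The following is a minimal sketch, assuming sets over a three-element universe stand in for the elements of $B(s)$; the function names are illustrative and not part of the paper. It confirms that "there exists $v$ with $|v \cap z^c| = 0$, $|z \cap v^c| = 0$ and $|w \cap v^c| \geq 1$" holds exactly when $|w \cap z^c| \geq 1$.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def exists_v(z, w, universe):
    """Quantified form: some v with |v - z| = 0, |z - v| = 0, |w - v| >= 1.

    Set difference v - z plays the role of v ∩ z^c.
    """
    return any(len(v - z) == 0 and len(z - v) == 0 and len(w - v) >= 1
               for v in powerset(universe))


def eliminated(z, w):
    """Quantifier-free form obtained after eliminating v."""
    return len(w - z) >= 1


def check(universe):
    """Compare both forms on every pair (z, w) of subsets."""
    subsets = powerset(universe)
    return all(exists_v(z, w, universe) == eliminated(z, w)
               for z in subsets for w in subsets)


if __name__ == "__main__":
    print(check(frozenset(range(3))))
```

The check is exhaustive only for the chosen universe; it illustrates the equivalence rather than proving it for arbitrary boolean algebras.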

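Similarly, to eliminate $u_{v2}$ we consider formula $\phi_{5,3}$:

\[
\begin{array}{l}
\phi_{5,3} = \exists u_{v2}. \\
\quad \mathrm{sh}(u_{v2}) = u^s_{w2} \land \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \land \\
\quad |u_{v2} \cap u_{z2}^c|_{u^s_{w2}} = 0 \land |u_{z2} \cap u_{v2}^c|_{u^s_{w2}} = 0
\end{array}
\tag{40}
\]

The result of boolean algebra quantifier elimination on $\phi_{5,3}$ is true (indeed, one may let $u_{v2} = u_{z2}$). The resulting base formula with $u_{v1}$ and $u_{v2}$ eliminated is $\phi_6$:

\[
\begin{array}{l}
\phi_6 = \exists u_z,u_w,u_{z1},u_{z2},u_{w1},u_{w2},u^s_w,u^s_{w1},u^s_{w2}. \\
\quad u^s_w = g^s(u^s_{w1},u^s_{w2}) \land \mathrm{distinct}(u^s_w,u^s_{w1},u^s_{w2}) \land \\
\quad u_z = g(u_{z1},u_{z2}) \land u_w = g(u_{w1},u_{w2}) \land \\
\quad z = u_z \land w = u_w \land \\
\quad \mathrm{sh}(u_z) = u^s_w \land \mathrm{sh}(u_w) = u^s_w \land \\
\quad \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \land \\
\quad \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \land \\
\quad |u_{w1} \cap u_{z1}^c|_{u^s_{w1}} \geq 1
\end{array}
\tag{41}
\]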

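Observe that the equalities in $\phi_6$ are sufficient to express all variables bound in $\phi_6$ in terms of free variables (all internal variables are "covered"):

\[
\begin{array}{ll}
u_z = z & u_w = w \\
u_{z1} = g_1(z) & u_{z2} = g_2(z) \\
u_{w1} = g_1(w) & u_{w2} = g_2(w) \\
u^s_w = \mathrm{sh}(w) & \\
u^s_{w1} = g^s_1(\mathrm{sh}(w)) & u^s_{w2} = g^s_2(\mathrm{sh}(w))
\end{array}
\tag{42}
\]

Structural base formula $\phi_6$ is therefore equivalent to the quantifier-free formula $\phi_{7,1}$:

\[
\begin{array}{l}
\phi_{7,1} = \mathrm{Is}_{g^s}(\mathrm{sh}(w)) \land \mathrm{Is}_g(w) \land \mathrm{Is}_g(z) \land \\
\quad \mathrm{distinct}(g^s_1(\mathrm{sh}(w)), g^s_2(\mathrm{sh}(w))) \land \\
\quad \mathrm{sh}(z) = \mathrm{sh}(w) \land |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathrm{sh}(w))} \geq 1
\end{array}
\tag{43}
\]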
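When transforming formula $\phi_4$ we chose the case $u^s_{w1} \neq u^s_{w2}$. If we choose the case $u^s_{w1} = u^s_{w2}$, we obtain quantifier-free formula $\phi_{7,2}$:

\[
\begin{array}{l}
\phi_{7,2} = \mathrm{Is}_{g^s}(\mathrm{sh}(w)) \land \mathrm{Is}_g(w) \land \mathrm{Is}_g(z) \land \\
\quad \mathrm{sh}(z) = \mathrm{sh}(w) \land g^s_1(\mathrm{sh}(w)) = g^s_2(\mathrm{sh}(w)) \land \\
\quad |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathrm{sh}(w))} \geq 1
\end{array}
\tag{44}
\]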
Our quantifier elimination would also consider the case $\mathrm{sh}(g_2(w)) \neq \mathrm{sh}(g_2(z))$. The procedure finds the case contradictory in a larger context, when eliminating $\exists z$, because $\mathrm{sh}(z) = \mathrm{sh}(x) = \mathrm{sh}(w)$ follows from $z \leq x$ and $w \leq x$. Ignoring this case, we observe that $\phi_{7,1} \lor \phi_{7,2}$ is equivalent to the quantifier-free formula $\phi_8$, where

\[
\begin{array}{l}
\phi_8 = \mathrm{Is}_{g^s}(\mathrm{sh}(w)) \land \mathrm{Is}_g(w) \land \mathrm{Is}_g(z) \land \\
\quad \mathrm{sh}(z) = \mathrm{sh}(w) \land |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathrm{sh}(w))} \geq 1
\end{array}
\tag{45}
\]

Let us therefore assume that the result of quantifier elimination in (28) is $\lnot\phi_8$.



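We proceed to eliminate the next quantifier, $\forall w$, from

\[ \forall w.\ w \leq x \land w \leq y \Rightarrow \lnot\phi_8 \tag{46} \]

(46) is equivalent to

\[ \lnot\exists w.\ w \leq x \land w \leq y \land \phi_8 \]

After eliminating $\leq$ we obtain

\[
\begin{array}{l}
\lnot\exists w.\ |w \cap x^c|_{\mathrm{sh}(w)} = 0 \land \mathrm{sh}(x) = \mathrm{sh}(w) \land \\
\quad |w \cap y^c|_{\mathrm{sh}(w)} = 0 \land \mathrm{sh}(y) = \mathrm{sh}(w) \land \\
\quad \mathrm{Is}_{g^s}(\mathrm{sh}(w)) \land \mathrm{Is}_g(w) \land \mathrm{Is}_g(z) \land \\
\quad \mathrm{sh}(z) = \mathrm{sh}(w) \land |g_1(w) \cap g_1(z)^c|_{g^s_1(\mathrm{sh}(w))} \geq 1
\end{array}
\tag{47}
\]

We now proceed similarly as in eliminating variable $v$. The result is $\lnot\phi_9$ where

\[
\begin{array}{l}
\phi_9 = \mathrm{sh}(x) = \mathrm{sh}(z) \land \mathrm{sh}(y) = \mathrm{sh}(z) \land \\
\quad \mathrm{Is}_g(x) \land \mathrm{Is}_g(y) \land \mathrm{Is}_g(z) \land \mathrm{Is}_{g^s}(\mathrm{sh}(z)) \land \\
\quad |g_1(x) \cap g_1(y) \cap g_1(z)^c|_{g^s_1(\mathrm{sh}(z))} \geq 1
\end{array}
\tag{48}
\]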

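The remaining quantifiers that bind $z$, $y$, and $x$ are eliminated similarly.

To eliminate the quantifier $\exists z$, we need to transform $\lnot\phi_9$ into a disjunction of base formulas. This transformation requires negation of $\phi_9$ and creates several disjuncts. We consider only the two cases, $\phi_{10}$ and $\phi_{11}$, that are not contradictory in the enclosing context of conjuncts $z \leq x$ and $z \leq y$:

\[
\phi_{10} = \mathrm{sh}(x) = \mathrm{sh}(z) \land \mathrm{sh}(y) = \mathrm{sh}(z) \land \lnot\mathrm{Is}_{g^s}(\mathrm{sh}(z))
\tag{49}
\]

\[
\begin{array}{l}
\phi_{11} = \mathrm{sh}(x) = \mathrm{sh}(z) \land \mathrm{sh}(y) = \mathrm{sh}(z) \land \\
\quad \mathrm{Is}_g(x) \land \mathrm{Is}_g(y) \land \mathrm{Is}_g(z) \land \mathrm{Is}_{g^s}(\mathrm{sh}(z)) \land \\
\quad |g_1(x) \cap g_1(y) \cap g_1(z)^c|_{g^s_1(\mathrm{sh}(z))} = 0
\end{array}
\tag{50}
\]

$\phi_{10}$ is equivalent to

\[ \mathrm{sh}(x) = c^s \land \mathrm{sh}(y) = c^s \land \mathrm{sh}(z) = c^s \tag{51} \]

The result of eliminating $\exists z$ from

\[ \exists z.\ |z \cap x^c|_{\mathrm{sh}(z)} = 0 \land |z \cap y^c|_{\mathrm{sh}(z)} = 0 \land \phi_{10} \]

is therefore

\[ \phi_{10,2} = \mathrm{sh}(x) = \mathrm{sh}(y) \land \lnot\mathrm{Is}_{g^s}(\mathrm{sh}(x)) \tag{52} \]

The result of eliminating $\exists z$ from

\[ \exists z.\ |z \cap x^c|_{\mathrm{sh}(z)} = 0 \land |z \cap y^c|_{\mathrm{sh}(z)} = 0 \land \phi_{11} \]

is

\[ \phi_{11,2} = \mathrm{sh}(x) = \mathrm{sh}(y) \land \mathrm{Is}_{g^s}(\mathrm{sh}(x)) \]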

$\phi_{10,2} \lor \phi_{11,2}$ is equivalent to $\mathrm{sh}(x) = \mathrm{sh}(y)$. Converting

\[ |x \cap y^c|_{\mathrm{sh}(x)} = 0 \land \mathrm{sh}(x) = \mathrm{sh}(y) \Rightarrow \mathrm{sh}(x) = \mathrm{sh}(y) \]

to structural base formula yields true. We conclude that (27) is a true sentence in the structure FT2, which completes our quantifier elimination procedure example.

Formulas in Example 42 do not contain disequalities between term variables, only disequalities between shape variables. If a conjunction contains disequalities between term variables, we eliminate the disequalities using rule (22) in the process of converting the formula to a disjunction of structural base formulas. The following Example 43 illustrates this process.

Example 43 Consider the formula

\[ \phi_6' = \phi_6 \land u_z \neq u_w \tag{53} \]

where $\phi_6$ is given by (41). By (22), literal $u_z \neq u_w$ is equivalent to $\psi_1 \lor \psi_2$ where

\[ \psi_1 = \mathrm{sh}(u_z) \neq \mathrm{sh}(u_w) \]

and

\[
\psi_2 = \mathrm{sh}(u_z) = \mathrm{sh}(u_w) \land |(u_z \cap u_w^c) \cup (u_z^c \cap u_w)|_{\mathrm{sh}(u_z)} \geq 1
\tag{54}
\]

In this case, formula $\phi_6 \land \psi_1$ is contradictory. Formula $\phi_6 \land \psi_2$ is equivalent to $\phi_6'$ where

\[
\begin{array}{l}
\phi_6' = \exists u_z,u_w,u_{z1},u_{z2},u_{w1},u_{w2},u^s_w,u^s_{w1},u^s_{w2}. \\
\quad u^s_w = g^s(u^s_{w1},u^s_{w2}) \land \mathrm{distinct}(u^s_w,u^s_{w1},u^s_{w2}) \land \\
\quad u_z = g(u_{z1},u_{z2}) \land u_w = g(u_{w1},u_{w2}) \land \\
\quad z = u_z \land w = u_w \land \\
\quad \mathrm{sh}(u_z) = u^s_w \land \mathrm{sh}(u_w) = u^s_w \land \\
\quad \mathrm{sh}(u_{z1}) = u^s_{w1} \land \mathrm{sh}(u_{w1}) = u^s_{w1} \land \\
\quad \mathrm{sh}(u_{z2}) = u^s_{w2} \land \mathrm{sh}(u_{w2}) = u^s_{w2} \land \\
\quad |u_{w1} \cap u_{z1}^c|_{u^s_{w1}} \geq 1 \land \\
\quad |(u_z \cap u_w^c) \cup (u_z^c \cap u_w)|_{u^s_w} \geq 1
\end{array}
\tag{55}
\]

As in Example 42 we now apply rule (25) to

\[ |(u_z \cap u_w^c) \cup (u_z^c \cap u_w)|_{u^s_w} \geq 1 \]

and transform $\phi_6'$ into a disjunction of base formulas.



We proceed to sketch the general case of quantifier elimination. The following Proposition 44 is analogous to Proposition 27; the proof is again straightforward.

Proposition 44 (Quantification of Structural Base)

If $\beta$ is a structural base formula and $x$ a free term variable in $\beta$, then there exists a base structural formula $\beta_1$ equivalent to $\exists x.\beta$.

The following Proposition 45 corresponds to Proposition 28.

Proposition 45 (Quantifier-Free to Structural Base)

Every well-defined quantifier-free formula in the language of Figure 5 can be written as true, false, or a disjunction of structural base formulas.

Proof Sketch. Let $\phi$ be a well-defined quantifier-free formula in the language of Figure 5.

We first use rule (21) to eliminate occurrences of $\leq$ in the formula, replacing them with cardinality constraints.

We then convert the formula into a disjunction $\phi_1 \lor \cdots \lor \phi_n$ of well-formed conjunctions of literals. We next describe how to transform each conjunction $\phi_i$ into a disjunction of base formulas.

Let $\phi_i$ be a conjunction of literals. Using the technique of Proposition 9 we convert the formula to unnested form, adding existential quantifiers. We then eliminate unnested conjuncts that contain boolean algebra operations, according to Figure 7. The only atomic formulas in the resulting existentially quantified conjunction are of form $x = a$, $x = b$, $x = g(x_1,x_2)$, $\mathrm{Is}_g(x)$, $x_1 = g_1(x)$, $x_2 = g_2(x)$, $x_1 = x_2$, $x^s = c^s$, $x^s = g^s(x^s_1,x^s_2)$, $\mathrm{Is}_{g^s}(x^s)$, $x^s_1 = g^s_1(x^s)$, $x^s_2 = g^s_2(x^s)$, $x^s_1 = x^s_2$, $x^s = \mathrm{sh}(x)$, as well as $|t_1|_{x^s} \geq k$ and $|t_2|_{x^s} = k$ for some $x^s$-terms $t_1$ and $t_2$. The only negated atomic formulas are of form $x_1 \neq x_2$, $x^s_1 \neq x^s_2$, $\lnot\mathrm{Is}_g(x)$ and $\lnot\mathrm{Is}_{g^s}(x^s)$. As in the proof of Proposition 28, we use (17) to eliminate $\lnot\mathrm{Is}_g(x)$ and $\lnot\mathrm{Is}_{g^s}(x^s)$. This process leaves formulas of form $x_1 \neq x_2$ and $x^s_1 \neq x^s_2$ as the only negated atomic formulas.

    unnested form               cardinality constraint
    $x = x_1 \cap_s x_2$        $|x + (x_1 \cap x_2)|_s = 0$
    $x = x_1 \cup_s x_2$        $|x + (x_1 \cup x_2)|_s = 0$
    $x = x_1^{c_s}$             $|x + x_1^c|_s = 0$

Figure 7: Elimination of Boolean Algebra Unnested Formulas. Expression $x + y$ is a shorthand for $(x \cap y^c) \cup (y \cap x^c)$.
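The translation in Figure 7 rests on the fact that two sets are equal exactly when their symmetric difference is empty. The following is a minimal sketch that checks the three rows exhaustively, assuming finite sets over a small universe as a stand-in for one boolean algebra $B(s)$; the helper names are illustrative.

```python
from itertools import combinations


def sym_diff(x, y, universe):
    """x + y, i.e. (x ∩ y^c) ∪ (y ∩ x^c), with complements taken in universe."""
    return (x & (universe - y)) | (y & (universe - x))


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def check_figure7(universe):
    """Check each unnested equation against its cardinality constraint."""
    all_sets = subsets(universe)
    for x in all_sets:
        for x1 in all_sets:
            comp = universe - x1
            # Row 3: x = x1^c  iff  |x + x1^c| = 0
            if (x == comp) != (len(sym_diff(x, comp, universe)) == 0):
                return False
            for x2 in all_sets:
                # Row 1: x = x1 ∩ x2  iff  |x + (x1 ∩ x2)| = 0
                if (x == x1 & x2) != (len(sym_diff(x, x1 & x2, universe)) == 0):
                    return False
                # Row 2: x = x1 ∪ x2  iff  |x + (x1 ∪ x2)| = 0
                if (x == x1 | x2) != (len(sym_diff(x, x1 | x2, universe)) == 0):
                    return False
    return True


if __name__ == "__main__":
    print(check_figure7(frozenset({0, 1, 2})))
```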

In the sequel, whenever we perform case analysis and generate a disjunction of conjunctions, existential quantifiers propagate to the conjunctions, so we keep working with an existentially quantified conjunction. The existentially quantified variables will become internal variables of a structural base formula.

We next convert conjuncts that contain only term variables to a base formula, and convert the shape part to a base formula, as in the proof of Proposition 28. We simultaneously make sure every term variable has an associated shape variable, introducing new shape variables if needed. (This process is interleaved with conversion to base formula, to ensure that there is always a conjunct stating that newly introduced shape variables are distinct.) We also ensure the homomorphism requirement by replacing internal variables when we entail their equality. Another condition we ensure is that parameter term variables map to parameter shape variables, and non-parameter term variables to non-parameter shape variables; we do this by performing expansion of term and shape variables. We perform expansion of shape variables as in Section 3.2. Expansion of term variables is even simpler because there is no need to do case analysis on equality of a term variable with other variables.

The resulting existentially quantified conjunction might contain disequalities $u \neq u'$ between term variables. We eliminate these disequalities as explained in Example 43, by converting each disequality into a cardinality constraint using (22). In general, we need to consider the case when $\mathrm{sh}(u) \neq \mathrm{sh}(u')$ and generate another disjunct.

Elimination of disequalities might violate previously es- 
tablished homomorphism invariants, so we may need to 
reestablish these invariants by repeating the previously de- 
scribed steps. The overall process terminates because we 
never introduce new inequalities between term variables. 

As a final step, we convert all cardinality constraints into constraints on parameter term variables, using (25). In the case when the shape of the cardinality constraint is $c^s$, we cannot apply (25). However, in that case the homomorphism condition ensures that each of the participating variables is equal to $a$ or equal to $b$. This means that we can simply evaluate the cardinality constraint in the boolean algebra $\{a,b\}$. If the result is true we simply drop the constraint; otherwise the entire base formula becomes false.

This completes our sketch of transforming a quantifier-free formula into a disjunction of structural base formulas. ∎

We introduce the notion of covered variables in a structural base formula by generalizing Definition 29.

Definition 46 The set covering of variable coverings of a structural base formula $\beta$ is the least set $S$ of pairs $\langle u, t\rangle$ where $u$ is an internal (shape or term) variable and $t$ is a term over the free variables of $\beta$, such that:

1. if $x = u$ occurs in termBase then $\langle u, x\rangle \in S$;

2. if $x^s = u^s$ occurs in shapeBase then $\langle u^s, x^s\rangle \in S$;

3. if $\langle u, t\rangle \in S$ and $u = f(u_1,\ldots,u_k)$ occurs in termBase for some $f \in \Sigma$ then $\{\langle u_1, f_1(t)\rangle, \ldots, \langle u_k, f_k(t)\rangle\} \subseteq S$;

4. if $\langle u^s, t^s\rangle \in S$ and $u^s = f^s(u^s_1,\ldots,u^s_k)$ occurs in shapeBase then $\{\langle u^s_1, f^s_1(t^s)\rangle, \ldots, \langle u^s_k, f^s_k(t^s)\rangle\} \subseteq S$;

5. if $\langle u, t\rangle \in S$ and $\mathrm{sh}(u) = u^s$ occurs in hom then $\langle u^s, \mathrm{sh}(t)\rangle \in S$.

Definition 47 An internal term variable $u$ is covered iff there exists a term $t$ such that $\langle u, t\rangle \in S$. An internal shape variable $u^s$ is covered iff there exists a term $t^s$ such that $\langle u^s, t^s\rangle \in S$.
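The least set $S$ of Definition 46 can be computed by a straightforward fixpoint iteration. The sketch below is one possible encoding, not the paper's: the dictionary representation of termBase, shapeBase and hom, the variable names such as `u_z` and `us_w`, and the string rendering of selector terms are all assumptions made for illustration. The sample data mirror a formula with free variables $z$, $w$, one binary constructor $g$, and all components sharing the shape variables of $w$.

```python
def covering(free_term, free_shape, term_eqs, shape_eqs, hom):
    """Least set S of (variable, term-as-string) pairs, per rules 1-5.

    free_term:  {x: u}                  for conjuncts x = u in termBase
    free_shape: {xs: us}                for conjuncts xs = us in shapeBase
    term_eqs:   {u: (f, [u1, ..., uk])} for conjuncts u = f(u1, ..., uk)
    shape_eqs:  {us: (fs, [us1, ...])}  for conjuncts us = fs(us1, ...)
    hom:        {u: us}                 for conjuncts sh(u) = us
    Term and shape variables are assumed to use disjoint names.
    """
    S = {(u, x) for x, u in free_term.items()}            # rule 1
    S |= {(us, xs) for xs, us in free_shape.items()}      # rule 2
    changed = True
    while changed:
        changed = False
        for var, t in list(S):
            new = set()
            for eqs in (term_eqs, shape_eqs):             # rules 3 and 4
                if var in eqs:
                    f, args = eqs[var]
                    for i, arg in enumerate(args, start=1):
                        new.add((arg, f"{f}_{i}({t})"))
            if var in hom:                                # rule 5
                new.add((hom[var], f"sh({t})"))
            if not new <= S:
                S |= new
                changed = True
    return S


def covered(S):
    """Variables that have at least one covering term."""
    return {var for var, _ in S}


if __name__ == "__main__":
    S = covering(
        free_term={"z": "u_z", "w": "u_w"},
        free_shape={},
        term_eqs={"u_z": ("g", ["u_z1", "u_z2"]),
                  "u_w": ("g", ["u_w1", "u_w2"])},
        shape_eqs={"us_w": ("gs", ["us_w1", "us_w2"])},
        hom={"u_z": "us_w", "u_w": "us_w",
             "u_z1": "us_w1", "u_w1": "us_w1",
             "u_z2": "us_w2", "u_w2": "us_w2"},
    )
    print(sorted(covered(S)))
```

The iteration terminates on acyclic inputs because every derived pair corresponds to a path in the finite graph of the base formula.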

Lemma 48 Let $\beta$ be a structural base formula with matrix $\beta_0$ and let $S$ be the covering of $\beta$.

1. If $\langle u, t\rangle \in S$ then $\models \beta_0 \Rightarrow u = t$.

2. If $\langle u^s, t^s\rangle \in S$ then $\models \beta_0 \Rightarrow u^s = t^s$.

Proof. By induction, using Definition 46. ∎

Corollary 49 Let $\beta$ be a structural base formula such that every internal variable is covered. Then $\beta$ is equivalent to a well-defined quantifier-free formula.

Proof. By Lemma 48 using (7). ∎

Lemma 50 Let $u$ be an uncovered non-parameter term variable in a structural base formula $\beta$ such that $u$ is a source, i.e. no conjunct of form

\[ u' = f(u_1,\ldots,u,\ldots,u_k) \]

occurs in termBase. Let $\beta'$ be the result of dropping $u$ from $\beta$. Then $\beta$ is equivalent to $\beta'$.

Proof. Let $u$ occur in termBase in form

\[ u = f(u_1,\ldots,u_k) \]

The only other occurrence of $u$ in $\beta$ is in hom and has the form $\mathrm{sh}(u) = u^s$. Because non-parameter term variables are mapped to non-parameter shape variables, shapeBase contains formula

\[ u^s = \mathrm{shapified}(f)(u^s_1,\ldots,u^s_k) \tag{56} \]

where $u^s_1,\ldots,u^s_k$ are such that, by the homomorphism property, $\mathrm{sh}(u_i) = u^s_i$ occurs in hom. This means that the conjunct $\mathrm{sh}(u) = u^s$ is a consequence of the remaining conjuncts, so it may be omitted. After that, applying (7) yields a structural base formula $\beta'$ not containing $u$, where $\beta'$ is equivalent to $\beta$. ∎

Corollary 51 Every base formula is equivalent to a base formula without uncovered non-parameter term variables.

Proof. If a structural base formula has an uncovered non-parameter term variable, then it has an uncovered non-parameter term variable that is a source. By repeated application of Lemma 50 we eliminate all uncovered non-parameter term variables. ∎

The next example illustrates how we deal with cardinality constraints $|1_s|_s \geq k$ and $|1_s|_s = k$, which contain no term variables. These constraints restrict the size of shape $s$. Luckily, we can translate them into shape base formula constraints.

Example 52 (Shape Term Size Constraints)

Let $x < y$ denote the conjunction $x \leq y \land x \neq y$. Let us eliminate quantifiers from formula $\exists x.\phi(x)$ where

\[
\phi(x) = \lnot(\exists y \exists z.\ x < y \land y < z) \land \lnot(\exists u.\ u < x)
\tag{57}
\]

Eliminating variables $y$, $z$ from the first conjunct and variable $u$ from the second conjunct yields

\[ \lnot\,|x^c|_{\mathrm{sh}(x)} \geq 2 \ \land\ \lnot\,|x|_{\mathrm{sh}(x)} \geq 1 \]

which is equivalent to

\[ (|x^c|_{\mathrm{sh}(x)} = 0 \lor |x^c|_{\mathrm{sh}(x)} = 1) \land |x|_{\mathrm{sh}(x)} = 0 \]

and further to the disjunction

\[ (|x^c|_{\mathrm{sh}(x)} = 0 \land |x|_{\mathrm{sh}(x)} = 0) \lor (|x^c|_{\mathrm{sh}(x)} = 1 \land |x|_{\mathrm{sh}(x)} = 0) \]

The first disjunct can be shown contradictory. Let us transform the second disjunct into a structural base formula. After introducing $u = x$ and $u^s = \mathrm{sh}(u)$, we obtain

\[ \exists u, u^s.\ x = u \land \mathrm{sh}(u) = u^s \land |u|_{u^s} = 0 \land |u^c|_{u^s} = 1 \]

Then $\exists x.\phi(x)$ is equivalent to

\[ \exists u, u^s.\ \mathrm{sh}(u) = u^s \land |u|_{u^s} = 0 \land |u^c|_{u^s} = 1 \]

Eliminating parameter term variable $u$ yields

\[ \exists u^s.\ |1|_{u^s} = 1 \]

Constraint $|1|_{u^s} = 1$ means that the largest set in the boolean algebra $B(s)$, where $s$ is the value of $u^s$, has size one. There exists exactly one boolean algebra of size one in the structure FT2, namely $\{a,b\}$. Therefore, $|1|_{u^s} = 1$ is equivalent to $u^s = c^s$. We may now eliminate $u^s$ by letting $u^s = c^s$. We conclude that the sentence $\exists x.\phi(x)$ is true.

Notice that we have also established that formula $\phi(x)$ is equivalent to $\mathrm{sh}(x) = c^s$, as a consequence of

\[ |1_{\mathrm{sh}(x)}|_{\mathrm{sh}(x)} = 1 \]
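The step from $|1|_{u^s} = 1$ to $u^s = c^s$ is an instance of a finite enumeration: a domain cardinality constraint $|1|_{u^s} = k$ can be replaced by the finitely many shapes with exactly $k$ occurrences of $c^s$. A minimal sketch of that enumeration follows, assuming shapes built from the constant `"c"` (standing for $c^s$) and one binary constructor `"g"` (standing for $g^s$) encoded as nested tuples; this encoding is an illustration, not the paper's.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shapes_with_leaves(k):
    """All shapes over {c, g/2} with exactly k occurrences of the constant c."""
    if k < 1:
        return ()
    if k == 1:
        return ("c",)
    result = []
    # A shape with k >= 2 leaves is g(left, right) with i and k - i leaves.
    for i in range(1, k):
        for left in shapes_with_leaves(i):
            for right in shapes_with_leaves(k - i):
                result.append(("g", left, right))
    return tuple(result)


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, len(shapes_with_leaves(k)))
```

For $k = 1$ the only candidate is the constant shape, matching the example; for larger $k$ the list grows but stays finite, which is what makes the replacement by a disjunction possible.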



The following Proposition 53 corresponds to Proposition 38.

Proposition 53 (Struct. Base to Quantifier-Free)

Every structural base formula $\beta$ is equivalent to a quantifier-free formula $\phi$ in the language of Figure 5.

Proof Sketch. By Corollary 51 we may assume that $\beta$ has no uncovered non-parameter term variables. By Corollary 49 we are done if there are no uncovered variables, so it suffices to eliminate uncovered parameter term variables and uncovered shape variables.

Let $u$ be an uncovered parameter term variable. Then $u$ does not occur in termBase. Indeed, suppose for the sake of contradiction that $u$ occurs in termBase in some formula

\[ u' = f(u_1,\ldots,u,\ldots,u_k) \]

Then $u'$ is an uncovered non-parameter variable in $\beta$, which is a contradiction because we have assumed $\beta$ has no uncovered non-parameter variables. Therefore, $u$ does not occur in termBase; it occurs only in hom and cardin. Let $\mathrm{sh}(u) = u^s$ occur in hom. Let $\psi_1,\ldots,\psi_p$ be all conjuncts of cardin that contain $u$. Each $\psi_i$ is of form $|t_i|_{u^s} \geq k_i$ or $|t_i|_{u^s} = k_i$ for some $u^s$-term $t_i$. Let $u_{j_1},\ldots,u_{j_q}$ be all term variables appearing in the terms $t_i$ other than $u$. Conjunct $\mathrm{sh}(u_{j_r}) = u^s$ occurs in hom for each $r$ where $1 \leq r \leq q$. The base formula can therefore be written in form

\[ \beta_1 = \exists x_1,\ldots,x_l.\ \phi \land \phi_1 \]

where

\[
\begin{array}{l}
\phi_1 = \exists u.\ \mathrm{sh}(u) = u^s \land \mathrm{sh}(u_{j_1}) = u^s \land \cdots \land \mathrm{sh}(u_{j_q}) = u^s \land \\
\quad \psi_1 \land \cdots \land \psi_p
\end{array}
\tag{58}
\]

All term variables in $\psi_1,\ldots,\psi_p$ range over terms of shape $u^s$. Therefore, $\phi_1$ defines a relation in the boolean algebra $B([\![u^s]\!])$. This allows us to apply the construction in Section 3.2. We eliminate $u$ from $\psi_1 \land \cdots \land \psi_p$ and obtain a propositional combination $\psi_0$ of cardinality constraints with $u^s$-terms. $\psi_0$ does not contain variable $u$. We may assume that $\psi_0$ is in disjunctive normal form

\[ \psi_0 = \alpha_1 \lor \cdots \lor \alpha_w \]

Let

\[ \phi_{1,i} = \mathrm{sh}(u_{j_1}) = u^s \land \cdots \land \mathrm{sh}(u_{j_q}) = u^s \land \alpha_i \]

for $1 \leq i \leq w$. Base formula $\beta_1$ is equivalent to the disjunction of base formulas $\beta_{1,i}$ where

\[ \beta_{1,i} = \exists x_1,\ldots,x_l.\ \phi \land \phi_{1,i} \]

We have thus eliminated an uncovered parameter term variable $u$ from $\beta_1$. By repeating this process we eliminate all uncovered parameter term variables from a base formula. The resulting formula contains no uncovered term variables.

It remains to eliminate uncovered shape variables. This process is similar to term algebra quantifier elimination in Section 3.4. An essential part of the construction in Section 3.4 is Lemma 25, which relies on the fact that uncovered parameter variables may take on infinitely many values. We therefore ensure that uncovered parameter shape variables are not constrained by term variables through conjuncts outside shapeBase.

Suppose that $u^s$ is an uncovered parameter shape variable in a base formula $\beta$. $u^s$ does not occur in termBase. $u^s$ does not occur in hom either, because all term variables are covered, and a conjunct $\mathrm{sh}(u) = u^s$ would imply that $u^s$ is covered. The only possible occurrence of $u^s$ is in a cardinality constraint $\psi$ of subformula cardin, where $\psi$ is of form $|t|_{u^s} = k$ or of form $|t|_{u^s} \geq k$. Suppose there is some term variable $u$ occurring in $t$. Then $\mathrm{sh}(u) = u^s$, so $u^s$ is covered, which is a contradiction. Therefore, $t$ has no variables. $t$ can thus be simplified to either $0_{u^s}$ or $1_{u^s}$. In general, a constraint of form $|1|_{u^s} = k$ or $|1|_{u^s} \geq k$ is a domain cardinality constraint for the boolean algebra $B([\![u^s]\!])$ (see Remark 15 as well as (20)). A constraint containing $|0_{u^s}|$ is equivalent to true or false. A constraint $|1|_{u^s} = 0$ is equivalent to false. A constraint $|1|_{u^s} = k$ for $k \geq 1$ is equivalent to

\[ u^s = t^s_1 \lor \cdots \lor u^s = t^s_p \]

where $t^s_1,\ldots,t^s_p$ is the list of all ground terms in signature $\Sigma_0$ that have exactly $k$ occurrences of constant $c^s$. We therefore generate a disjunction of base formulas $\beta_1,\ldots,\beta_p$ where $\beta_i$ results from $\beta$ by replacing $|1|_{u^s} = k$ with $u^s = t^s_i$. We convert each $\beta_i$ to a disjunction of base formulas by labelling subterms of $t^s_i$ by internal shape variables and doing case analysis on the equality between new internal shape variables to ensure the invariants of a base formula. The result is a disjunction of base formulas where variable $u^s$ occurs only in the shapeBase subformula.

Similarly, $|1|_{u^s} \geq k+1$ is equivalent to $\lnot(|1|_{u^s} \leq k)$ and thus to

\[ u^s \neq t^s_1 \land \cdots \land u^s \neq t^s_p \tag{59} \]

where $t^s_1,\ldots,t^s_p$ is the list of all ground terms in signature $\Sigma_0$ that have at most $k$ occurrences of constant $c^s$. We replace $|1|_{u^s} \geq k+1$ by (59) and again convert the result to a disjunction of base formulas where $u^s$ occurs only in the shapeBase subformula.

Each of the resulting base formulas $\beta^3$ is such that every uncovered variable in $\beta^3$ is a shape variable that occurs only in shapeBase. Let

\[
\begin{array}{l}
\beta^3 = \exists u_1,\ldots,u_n,u^s_1,\ldots,u^s_{n^s}. \\
\quad \mathrm{shapeBase}(u^s_1,\ldots,u^s_{p^s},u^s_{p^s+1},\ldots,u^s_{p^s+q^s},x^s_1,\ldots,x^s_{m^s}) \land \\
\quad \mathrm{termBase}(u_1,\ldots,u_n,x_1,\ldots,x_m) \land \\
\quad \mathrm{hom}(u_1,\ldots,u_n,u^s_1,\ldots,u^s_{n^s}) \land \\
\quad \mathrm{cardin}(u_{p+1},\ldots,u_{p+q},u^s_{p^s+1},\ldots,u^s_{p^s+q^s})
\end{array}
\]

where $u^s_1,\ldots,u^s_{p^s}$ are uncovered shape variables. Then $\beta^3$ is equivalent to $\beta^4$:

\[
\begin{array}{l}
\beta^4 = \exists u_1,\ldots,u_n,u^s_{p^s+1},\ldots,u^s_{p^s+q^s}. \\
\quad \phi^3(u^s_{p^s+1},\ldots,u^s_{p^s+q^s},x^s_1,\ldots,x^s_{m^s}) \land \\
\quad \mathrm{termBase}(u_1,\ldots,u_n,x_1,\ldots,x_m) \land \\
\quad \mathrm{hom}(u_1,\ldots,u_n,u^s_1,\ldots,u^s_{n^s}) \land \\
\quad \mathrm{cardin}(u_{p+1},\ldots,u_{p+q},u^s_{p^s+1},\ldots,u^s_{p^s+q^s})
\end{array}
\]

Here $\phi^3$ is a base formula (Definition 21) whose free variables are the variables free in $\beta^3$ as well as all covered shape variables:

\[
\begin{array}{l}
\phi^3(u^s_{p^s+1},\ldots,u^s_{p^s+q^s},x^s_1,\ldots,x^s_{m^s}) = \exists u^s_1,\ldots,u^s_{p^s}. \\
\quad \mathrm{shapeBase}(u^s_1,\ldots,u^s_{p^s},u^s_{p^s+1},\ldots,u^s_{p^s+q^s},x^s_1,\ldots,x^s_{m^s})
\end{array}
\]

Applying Lemma 37 we conclude that $\phi^3$ is equivalent to some disjunction

\[ \bigvee_{i=1}^{k} \phi^{3,i} \]

of base formulas without uncovered variables. Let $\beta^{4,i}$ be the result of replacing $\phi^3$ with $\phi^{3,i}$ in $\beta^4$. Then $\beta^3$ is equivalent to

\[ \bigvee_{i=1}^{k} \beta^{4,i} \]

and each $\beta^{4,i}$ has no uncovered variables either, because every free variable of $\phi^{3,i}$ is either free or covered in $\beta^{4,i}$. By Corollary 49 each $\beta^{4,i}$ can be written as a quantifier-free formula. ∎

The following Theorem 54 corresponds to Theorem 39 of Section 3.4.

Theorem 54 (Two Constants Quant. Elimination)

There exist algorithms A, B such that for a given formula $\phi$ in the language of Figure 5:

a) A produces a quantifier-free formula $\phi'$ in selector language

b) B produces a disjunction $\phi'$ of structural base formulas

Proof. Analogous to the proof of Theorem 39, using Proposition 45 in place of Proposition 28 and Proposition 53 in place of Proposition 38. ∎



Corollary 55 The first-order theory of the structure FT2 is decidable.

This completes the description of our quantifier elimination for the first-order theory of structure FT2, which models structural subtyping with two base types and one binary constructor. It is straightforward to extend the construction of this section to any number of covariant constructors if the base formula has only two constants. In Section 5 we extend the result to any number of constants as well. Finally, in Section 6 we extend the result to allow arbitrary decidable structures for primitive types, even if the number of primitive types is infinite.

5 A Finite Number of Constants 

In this section we prove the decidability of structural subtyping of any finite number of constant symbols (primitive types) and any number of function symbols (constructors). We first show the result when all constructors are covariant; we then show the result when some of the constructors are contravariant.

We introduce the notion of $\Sigma$-term-power of some structure $\mathcal{C}$ as a generalization of the structure of structural subtyping.

We represent primitive types in structural subtyping as a structure $\mathcal{C}$ with a finite carrier $C$. We call $\mathcal{C}$ the base structure. Without loss of generality, we assume that $\mathcal{C}$ has only relations; functions and constants are definable using relations. Let $L_C$ be a set of relation symbols and let $\leq\ \in L_C$ be a distinguished binary relation symbol. $\leq$ represents the subtype ordering between types. $C$ is finite, so $\mathcal{C}$ is decidable (see Section 6 for the case when $C$ is infinite but decidable).

We represent type constructors as free operations in the term algebra with signature $\Sigma$. To represent the variance of constructors we define for each constructor $f \in \Sigma$ of arity $\mathrm{ar}(f) = k$ and each argument $1 \leq i \leq k$ the value $\mathrm{variance}(f,i) \in \{-1,1\}$. The constructor $f$ is covariant in argument $i$ iff $\mathrm{variance}(f,i) = 1$. For convenience we assume $\mathrm{ar}(f) \geq 1$ for each $f \in \Sigma$.

The $\Sigma$-term-power of $\mathcal{C}$ is a structure $\mathcal{P}$ defined as follows. Let $\Sigma' = \Sigma \cup C$. The domain of $\mathcal{P}$ is the set $P$ of finite ground $\Sigma'$-terms. Elements of $C$ are viewed as constants of arity 0. The structure $\mathcal{P}$ has signature $\Sigma \cup L_C$. The constructors $f \in \Sigma$ are interpreted in $\mathcal{P}$ as in a free term algebra:

\[ [\![f]\!]^{\mathcal{P}}(t_1,\ldots,t_k) = f(t_1,\ldots,t_k) \]

A relation $r \in L_C \setminus \{\leq\}$ is interpreted pointwise on the terms of the same "shape" as follows. $[\![r]\!]^{\mathcal{P}}$ is the least relation $\rho$ such that:

1. if $[\![r]\!]^{\mathcal{C}}(c_1,\ldots,c_n)$ then $\rho(c_1,\ldots,c_n)$;

2. if $\rho(t_{1i},\ldots,t_{ni})$ for all $i$ where $1 \leq i \leq k$, then $\rho(f(t_{11},\ldots,t_{1k}),\ldots,f(t_{n1},\ldots,t_{nk}))$.

The relation $\leq\ \in L_C$ is interpreted similarly, but taking into account the variance. $[\![\leq]\!]^{\mathcal{P}}$ is the least relation $\rho$ such that

1. if $[\![\leq]\!]^{\mathcal{C}}(c_1,c_2)$ then $\rho(c_1,c_2)$;

2. if $\rho^{\mathrm{variance}(f,i)}(t_{1i},\ldots,t_{ni})$ for all $i$ where $1 \leq i \leq k$, then $\rho(f(t_{11},\ldots,t_{1k}),\ldots,f(t_{n1},\ldots,t_{nk}))$.

Here we use the notation $\rho^v$ for $v \in \{-1,1\}$ with the meaning: $\rho^1 = \rho$ and $\rho^{-1} = \{\langle y,x\rangle \mid \langle x,y\rangle \in \rho\}$.
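Because terms are finite, the least relation interpreting $\leq$ in the term-power can be decided by structural recursion. The following is a minimal sketch under assumed encodings: primitive types are strings, constructed terms are tuples `(f, t1, ..., tk)`, and the names `int`, `real` and `arrow` are hypothetical examples that do not come from the paper.

```python
def term_leq(s, t, base_leq, variance):
    """Decide s <= t in the term-power.

    base_leq: set of pairs (c1, c2) giving <= on primitive types
    variance: {(f, i): 1 or -1} for constructor f and argument index i
    """
    s_is_base = not isinstance(s, tuple)
    t_is_base = not isinstance(t, tuple)
    if s_is_base or t_is_base:
        # Rule 1: primitive types compare in the base structure only.
        return s_is_base and t_is_base and (s, t) in base_leq
    if s[0] != t[0] or len(s) != len(t):
        # Terms of different shape are never related.
        return False
    for i in range(1, len(s)):
        a, b = s[i], t[i]
        if variance[(s[0], i)] == -1:
            a, b = b, a  # rho^{-1}: swap arguments for contravariance
        if not term_leq(a, b, base_leq, variance):
            return False
    return True


if __name__ == "__main__":
    # Hypothetical base structure: int <= real, plus reflexivity.
    base = {("int", "int"), ("real", "real"), ("int", "real")}
    # Hypothetical constructor: contravariant in argument 1, covariant in 2.
    var = {("arrow", 1): -1, ("arrow", 2): 1}
    f = ("arrow", "real", "int")
    g = ("arrow", "int", "real")
    print(term_leq(f, g, base, var), term_leq(g, f, base, var))
```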

We next sketch the decidability of structural subtyping for any finite number of primitive types $C$. For now we assume that all constructors $f \in \Sigma$ are covariant; the relation $\leq$ thus does not play a special role.

5.1 Extended Term-Power Structure 

For the purpose of quantifier elimination we define the structure $\mathcal{P}_E$ by extending the domain and the set of operations of the term-power structure $\mathcal{P}$.

The domain of $\mathcal{P}_E$ is $P_E = P \cup P_S$ where $P_S$ is the set of shapes defined as follows. Let $\Sigma^s = \{c^s\} \cup \{f^s \mid f \in \Sigma\}$ be a set of function symbols such that $c^s$ is a fresh constant symbol with $\mathrm{ar}(c^s) = 0$ and $f^s$ are fresh distinct function symbols with $\mathrm{ar}(f^s) = \mathrm{ar}(f)$ for each $f \in \Sigma$. The set of shapes $P_S$ is the set of ground $\Sigma^s$-terms. When referring to elements of $\mathcal{P}_E$, by term we mean an element of $P$; by shape we mean an element of $P_S$. We write $X^s$ to denote an entity pertaining to shapes as opposed to terms, so $x^s, u^s$ denote variables ranging over shapes, and $t^s$ denotes terms that evaluate to shapes.

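The extended structure $\mathcal{P}_E$ contains term algebra operations on terms and shapes (including selector operations and tests, [22, Page 61]), the homomorphism sh, and cardinality constraint relations $|\phi|_{t^s} = k$ and $|\phi|_{t^s} \geq k$:

1. constructors in the term algebra of terms, $f \in \Sigma'$,

\[ [\![f]\!]^{\mathcal{P}_E}(t_1,\ldots,t_k) = f(t_1,\ldots,t_k); \]

2. selectors in the term algebra of terms,

\[ [\![f_i]\!]^{\mathcal{P}_E}(f(t_1,\ldots,t_k)) = t_i; \]

3. constructor tests in the term algebra of terms,

\[ [\![\mathrm{Is}_f]\!]^{\mathcal{P}_E}(t) = \exists t_1,\ldots,t_k.\ t = f(t_1,\ldots,t_k); \]

4. constructors in the term algebra of shapes, $f^s \in \Sigma^s$,

\[ [\![f^s]\!]^{\mathcal{P}_E}(t^s_1,\ldots,t^s_k) = f^s(t^s_1,\ldots,t^s_k); \]

5. selectors in the term algebra of shapes,

\[ [\![f^s_i]\!]^{\mathcal{P}_E}(f^s(t^s_1,\ldots,t^s_k)) = t^s_i; \]

6. constructor tests in the term algebra of shapes,

\[ [\![\mathrm{Is}_{f^s}]\!]^{\mathcal{P}_E}(t^s) = \exists t^s_1,\ldots,t^s_k.\ t^s = f^s(t^s_1,\ldots,t^s_k); \]

7. the homomorphism mapping terms to shapes such that:

\[
[\![\mathrm{sh}]\!]^{\mathcal{P}_E}(f(t_1,\ldots,t_n)) = \mathrm{shapified}(f)([\![\mathrm{sh}]\!]^{\mathcal{P}_E}(t_1),\ldots,[\![\mathrm{sh}]\!]^{\mathcal{P}_E}(t_n))
\tag{60}
\]

where

\[
\begin{array}{ll}
\mathrm{shapified}(x) = c^s, & \text{if } x \in C \\
\mathrm{shapified}(f) = f^s, & \text{if } f \in \Sigma
\end{array}
\tag{61}
\]

8. cardinality constraint relations

\[
[\![\,|\phi(x_1,\ldots,x_k)|_{t^s} = k\,]\!]^{\mathcal{P}_E}(t_1,\ldots,t_k) \ =\ \big|[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(t_1,\ldots,t_k)\big| = k
\tag{62}
\]

and

\[
[\![\,|\phi(x_1,\ldots,x_k)|_{t^s} \geq k\,]\!]^{\mathcal{P}_E}(t_1,\ldots,t_k) \ =\ \big|[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(t_1,\ldots,t_k)\big| \geq k
\tag{63}
\]

where $\phi(x_1,\ldots,x_k)$ is a first-order formula over the base-structure language $L_C$ with free variables $x_1,\ldots,x_k$, term $t^s$ denotes a shape, and $k$ is a nonnegative integer constant.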

It remains to complete the semantics of cardinality constraint relations, by defining the set $[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(t_1,\ldots,t_k)$. If $s$ is a shape, we call the set of positions of constant $c^s$ in $s$ the leaves of $s$, and denote it by $\mathrm{leaves}(s)$. We represent a leaf as a sequence of pairs $\langle f,i\rangle$ where $f$ is a constructor of arity $k$ and $1 \leq i \leq k$. If $l \in \mathrm{leaves}(s)$ and $\mathrm{sh}(t) = s$, then $t[l]$ denotes the element $c \in C$ at position $l$ in term $t$, i.e. if $l = \langle f^1,i^1\rangle \cdots \langle f^n,i^n\rangle$ then

\[ t[l] = f^n_{i^n}(\cdots f^2_{i^2}(f^1_{i^1}(t)) \cdots) \tag{64} \]

We define:

\[
[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(t_1,\ldots,t_k) = \{\, l \mid [\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{C}}(t_1[l],\ldots,t_k[l]) \,\}
\tag{65}
\]



The following equations follow from (65) and can be used as an equivalent alternative definition for cardinality relations:

\[
\big|[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(c_1,\ldots,c_k)\big| =
\begin{cases}
1, & [\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{C}}(c_1,\ldots,c_k) \\
0, & \lnot[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{C}}(c_1,\ldots,c_k)
\end{cases}
\tag{66}
\]

\[
\begin{array}{l}
\big|[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(f(t_{11},\ldots,t_{1l}),\ldots,f(t_{k1},\ldots,t_{kl}))\big| = \\
\quad \big|[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(t_{11},\ldots,t_{k1})\big| + \cdots + \big|[\![\phi(x_1,\ldots,x_k)]\!]^{\mathcal{P}_E}(t_{1l},\ldots,t_{kl})\big|
\end{array}
\tag{67}
\]

Definition (65) generalizes [14, Definition 2.1, Page 63]. We write $|\phi(t_1,\ldots,t_k)|_{t^s} = k$ as a shorthand for the atomic formula $(|\phi(x_1,\ldots,x_k)|_{t^s} = k)(t_1,\ldots,t_k)$, and similarly for $|\phi(t_1,\ldots,t_k)|_{t^s} \geq k$. This is more than a notational convenience; see Section 6 for an approach which introduces sets of leaves as elements of the domain of $\mathcal{P}_E$ and defines a cylindric algebra interpreted over sets of leaves. The approach in this section follows [35] in merging the quantifier elimination for products and quantifier elimination for boolean algebras.
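The two descriptions of cardinality, by counting leaves as in (65) and by recursion over the term structure as in (66) and (67), can be compared on small inputs. The sketch below assumes an illustrative encoding that is not taken from the paper: primitive elements are strings, constructed terms are tuples `(f, t1, ..., tk)`, the leaf shape is the string `"c"`, and $\phi$ is passed as a Python predicate on base elements.

```python
def sh(t):
    """Shape of a term: replace every primitive element by the constant 'c'."""
    if not isinstance(t, tuple):
        return "c"
    return (t[0] + "^s",) + tuple(sh(a) for a in t[1:])


def leaves(s):
    """Positions of 'c' in shape s, each a tuple of (constructor, index) pairs."""
    if s == "c":
        return [()]
    return [((s[0], i),) + path
            for i, sub in enumerate(s[1:], start=1)
            for path in leaves(sub)]


def at(t, leaf):
    """t[l]: follow the argument indices of leaf l down into term t."""
    for _constructor, i in leaf:
        t = t[i]
    return t


def card_by_leaves(phi, terms):
    """Count leaves l where phi(t1[l], ..., tk[l]) holds, as in (65)."""
    s = sh(terms[0])
    assert all(sh(t) == s for t in terms), "terms must share one shape"
    return sum(1 for l in leaves(s) if phi(*(at(t, l) for t in terms)))


def card_by_recursion(phi, terms):
    """Same count via the base case (66) and the sum over arguments (67)."""
    if not isinstance(terms[0], tuple):
        return 1 if phi(*terms) else 0
    arity = len(terms[0]) - 1
    return sum(card_by_recursion(phi, [t[j] for t in terms])
               for j in range(1, arity + 1))


if __name__ == "__main__":
    leq = {("a", "a"), ("a", "b"), ("b", "b")}   # base order a <= b
    t1 = ("g", "a", ("g", "b", "a"))
    t2 = ("g", "b", ("g", "a", "a"))
    not_leq = lambda x, y: (x, y) not in leq
    print(card_by_leaves(not_leq, [t1, t2]),
          card_by_recursion(not_leq, [t1, t2]))
```

With covariant constructors, a count of zero for the predicate "not $x \leq y$" means the first term is below the second in every leaf; in the sample data the count is nonzero because of the leaf where `b` meets `a`.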

Some of the operations in $\mathcal{P}_E$ are partial. We use the definitions and results of Section 2.3 to deal with partial functions. $f_i(t)$ is defined iff $\mathrm{Is}_f(t)$ holds; $f^s_i(t^s)$ is defined iff $\mathrm{Is}_{f^s}(t^s)$ holds. Cardinality constraints $|\phi(t_1,\ldots,t_k)|_{t^s} = k$ and $|\phi(t_1,\ldots,t_k)|_{t^s} \geq k$ are defined iff $\mathrm{sh}(t_1) = \cdots = \mathrm{sh}(t_k) = t^s$ holds.

The structure $\mathcal{P}_E$ is at least as expressive as $\mathcal{P}$ because the only operations or relations present in $\mathcal{P}$ but not in $\mathcal{P}_E$ are $[\![r]\!]^{\mathcal{P}}$ for $r \in L_C$, and we can express $[\![r]\!]^{\mathcal{P}}(t_1,\ldots,t_k)$ as

\[ |\lnot r(t_1,\ldots,t_k)|_{\mathrm{sh}(t_1)} = 0. \]

Our goal is to give a quantifier elimination for first-order formulas of structure $\mathcal{P}_E$. By a quantifier-free formula we mean a formula without quantifiers outside cardinality constraints, e.g. the formula $|\forall x.\ x \leq t|_{x^s} = k$ is quantifier-free.

5.2 Structural Base Formulas 

In this section we define the notion of structural base formulas for any base structure $\mathcal{C}$ with a finite carrier.

Definition 56 of structural base formula for quantifier elimination in $\mathcal{P}_E$ differs from Definition 41 in the conjuncts of the cardin subformula. Instead of cardinality constraints on boolean algebra terms, Definition 56 contains cardinality constraints on first-order formulas.

The notion of base formula and Lemma 25 apply to terms $P$ as well as shapes $P_S$ in the structure $\mathcal{P}_E$ because shapes are also terms over the alphabet $\Sigma^s$. For brevity we write $u^*$ for an internal shape or term variable, and similarly $x^*$ for a free shape or term variable, $t^*$ for terms, $f^*$ for a term or shape term algebra constructor and $f_i^*$ for a term or shape term algebra selector.

Definition 56 (Structural Base Formula) 

A structural base formula with: 

• free term variables Xi, . . . , Xm,; 

• internal non-parameter term variables ui, . . . , Up; 

• internal parameter term variables iip+i, . . . , Up+q; 

• free shape variables x\, . . . , aj^s/ 

• internal non-parameter shape variables u\, . . . ,Ups; 

• internal parameter shape variables Ups, . . . , Ups^^s 

is a formula of the form: 

3ui, . . . ,u„,u\, . . . ,u^s. 
shapeBase(iii, . . . ,u^s,a;|, . . . ,x^^s) A 
termBase(iii, . . . , u„,a;i, . . . ,a;,„) A 
termHom(iti, . . . , ii„,iii, . . . , u^s) A 
cardin(up+i, . . . ,u„,?ips+i, . . . ,u^s) 

where n = p + q, n' = p'^ + q^, and formulas shapeBase, 
termBase, termHom, cardin are defined as follows. 

shapeBase(ui,.. . ,u^„s,x\, . . . ,xl„s) = 

p m^ 

f\ ul = ti{u\, . . . , u^s) A f\ xl = Uj, 

i=l i = l 

A distinct(iii, . . . ,u^) 



where each ti is a shape term of the form f^{u\^ , ■ ■ ■ , nl^^ ) 
for some / £ Eo, fc = ar(/), and 

j : {1, . . . , m^} -^ {1, . . . , n^} is a function mapping indices 
of free shape variables to indices of internal shape variables. 

termBase(tti,. ..,Un,xi,... ,Xm) = 

p m 

/\ Ui =ti{ui,. . . ,Un) A l\ Xi ^ Uj. 
1=1 i=l 

where each ti is a term of the form f{ui^ , ■ ■ ■ , Ui^. ) for 
some f £ T,, k — ar(/), and j : {1, . . . , m} — > {1, . . . , n} is 
a function mapping indices of free term variables to indices 
of internal term variables. 



termHomfui, 



, M„,Ui,. . . ,M^s) = f\ s[\{Ui) = ul 



where j : {1, . . . , n} — > {1, . . . ,n^} is some function such 

that {ji, . . . ,jp} C {1, . . . ,p=} and 

{jp+i, . . . ,jp-t-q} C {p' -I- 1, . . . ,p' + g""} (a term variable is 

a parameter variable iff its shape is a parameter shape 

variable). 



$$\mathrm{cardin}(u_{p+1}, \ldots, u_n, u^S_{p^S+1}, \ldots, u^S_{n^S}) = \psi_1 \wedge \cdots \wedge \psi_d$$

where each $\psi_i$ is a cardinality constraint of the form

$$|\phi(u_{j_1}, \ldots, u_{j_l})|_{u^S} = k$$

or

$$|\phi(u_{j_1}, \ldots, u_{j_l})|_{u^S} \geq k$$

where $\{j_1, \ldots, j_l\} \subseteq \{p + 1, \ldots, n\}$ and the conjunct
$\mathrm{sh}(u_{j_d}) = u^S$ occurs in termHom for $1 \leq d \leq l$. We require
each structural base formula to satisfy the following
conditions:

P0) the graph associated with the shape base formula

$$\exists u^S_1, \ldots, u^S_{n^S}.\ \mathrm{shapeBase}(u^S_1, \ldots, u^S_{n^S}, x^S_1, \ldots, x^S_{m^S})$$

is acyclic;

P1) congruence closure property for shapeBase subformula:
there are no two distinct variables $u^S_i$ and $u^S_j$ such that
both $u^S_i = f(u^S_{l_1}, \ldots, u^S_{l_k})$ and $u^S_j = f(u^S_{l_1}, \ldots, u^S_{l_k})$
occur as conjuncts in formula shapeBase;

P2) congruence closure property for termBase subformula:
there are no two distinct variables $u_i$ and $u_j$ such that
both $u_i = f(u_{l_1}, \ldots, u_{l_k})$ and $u_j = f(u_{l_1}, \ldots, u_{l_k})$
occur as conjuncts in formula termBase;

P3) homomorphism property of sh: for every
non-parameter term variable $u$ such that
$u = f(u_{i_1}, \ldots, u_{i_k})$ occurs in termBase, if the conjunct
$\mathrm{sh}(u) = u^S$ occurs in termHom, then for some shape
variables $u^S_{j_1}, \ldots, u^S_{j_k}$ the conjunct $u^S = f^S(u^S_{j_1}, \ldots, u^S_{j_k})$
occurs in shapeBase where $f^S = \mathrm{shapified}(f)$, and for
every $r$ where $1 \leq r \leq k$, the conjunct $\mathrm{sh}(u_{i_r}) = u^S_{j_r}$
occurs in termHom.

Note that the validity of the occur check for term variables
follows from P0) and P3). Another immediate consequence
of Definition 56 is the following Proposition 57.

Proposition 57 (Quantification of Str. Base Form.)

If $\beta$ is a structural base formula and $x$ a free shape or term
variable in $\beta$, then there exists a structural base formula $\beta_1$
equivalent to $\exists x.\beta$.

We proceed to show that a quantifier-free formula can be 
written as a disjunction of base formulas, and a base formula 
can be written as a quantifier-free formula. 
5.3 Conversion to Base Formulas

Conversion from a quantifier-free formula to a disjunction of structural
base formulas is given by Proposition 58. The proof of Proposition 58
is analogous to the proof of Proposition 45 but uses
(67) instead of (25).

Proposition 58 (Quantifier-Free to Structural Base)

Every well-defined quantifier-free formula $\phi$ is equivalent
on $\mathcal{P}$ to true, false, or some disjunction of structural base
formulas.

5.4 Conversion to Quantifier-Free Formulas 

The conversion from structural base formulas to quantifier-free
formulas is similar to the case of two constant symbols
in Section 4 but requires the use of the Feferman-Vaught
technique.

Definition 59 The set determinations of variable determinations
of a structural base formula $\beta$ is the least set $S$ of
pairs $\langle u^*, t^* \rangle$ where $u^*$ is an internal term or shape variable
and $t^*$ is a term over the free variables of $\beta$, such that:

1. if $x^* = u^*$ occurs in termBase or shapeBase, then
$\langle u^*, x^* \rangle \in S$;

2. if $\langle u^*, t^* \rangle \in S$ and $u^* = f^*(u^*_1, \ldots, u^*_k)$ occurs in
shapeBase or termBase then
$\{\langle u^*_1, f^*_1(t^*) \rangle, \ldots, \langle u^*_k, f^*_k(t^*) \rangle\} \subseteq S$;

3. if $\{\langle u^*_1, f^*_1(t^*) \rangle, \ldots, \langle u^*_k, f^*_k(t^*) \rangle\} \subseteq S$ and
$u^* = f^*(u^*_1, \ldots, u^*_k)$ occurs in shapeBase or termBase
then $\langle u^*, t^* \rangle \in S$;

4. if $\langle u, t \rangle \in S$ and $\mathrm{sh}(u) = u^S$ occurs in termHom then
$\langle u^S, \mathrm{sh}(t) \rangle \in S$.



Definition 60 An internal variable $u^*$ is determined if
$\langle u^*, t^* \rangle \in$ determinations for some term $t^*$. An internal
variable is undetermined if it is not determined.

Lemma 61 Let $\beta$ be a structural base formula with matrix
$\beta_0$ and let determinations be the determinations of $\beta$. If
$\langle u^*, t^* \rangle \in$ determinations then $\models \beta_0 \Rightarrow u^* = t^*$.

Corollary 62 Let $\beta$ be a structural base formula such that
every internal variable is determined. Then $\beta$ is equivalent
to a well-defined quantifier-free formula.

Proof. By Lemma 61 using

$$\exists x.\ x = t \wedge \phi(x) \iff \phi(t) \qquad (68)$$

■



Lemma 63 Let $u$ be an undetermined non-parameter term
variable in a structural base formula $\beta$ such that $u$ is a source,
i.e. no conjunct of the form

$$u' = f(u_1, \ldots, u, \ldots, u_k)$$

occurs in termBase. Let $\beta'$ be the result of removing $u$ and
the conjuncts containing $u$ from $\beta$. Then $\beta$ is equivalent to $\beta'$.

Proof. The conjunct containing $u$ in termHom is a consequence
of the remaining conjuncts, so we drop it. We then
apply (68). ■



Corollary 64 Every base formula is equivalent to a base
formula without undetermined non-parameter term variables.

Proof. If a structural base formula has an undetermined
non-parameter term variable, then it has an undetermined
non-parameter term variable that is a source. Repeatedly
apply Lemma 63 to eliminate all undetermined
non-parameter term variables. ■

The following Lemma 65 is a consequence of the fact that
terms of a fixed shape $s$ form a substructure of $\mathcal{P}$ isomorphic
to the finite power $C^m$ where $m = |\mathrm{leaves}(s)|$, and follows
from the Feferman-Vaught theorem in Section 3.3.

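Lemma 65 Let

$$\alpha \equiv \exists u.\ \mathrm{sh}(u) = u^S \wedge \mathrm{sh}(u_1) = u^S \wedge \ldots \wedge \mathrm{sh}(u_k) = u^S \wedge \psi_1 \wedge \ldots \wedge \psi_p \qquad (69)$$

where each $\psi_i$ is a cardinality constraint of the form $|\phi|_{u^S} = k$
or $|\phi|_{u^S} \geq k$ where all free variables of $\phi$ are among
$u, u_1, \ldots, u_k$. Then there exists a formula $\psi$ such that $\psi$
is a disjunction of conjunctions of cardinality constraints
$|\phi'|_{u^S} = k$ and $|\phi'|_{u^S} \geq k$ where the free variables in each $\phi'$ are
among $u_1, \ldots, u_k$ and formula $\alpha$ is equivalent on $\mathcal{P}$ to $\alpha'$
where

$$\alpha' \equiv \mathrm{sh}(u_1) = u^S \wedge \ldots \wedge \mathrm{sh}(u_k) = u^S \wedge \psi \qquad (70)$$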

Proposition 66 (Struct. Base to Quantifier-Free)

Every structural base formula $\beta$ is equivalent on $\mathcal{P}$ to
some well-defined quantifier-free formula $\phi$.

Proof Sketch. By Corollary 64 we may assume that $\beta$ has
no undetermined non-parameter term variables. By Corollary 62
we are done if there are no undetermined variables,
so it suffices to eliminate undetermined parameter term variables
and undetermined shape variables.

Let $u$ be an undetermined parameter term variable. $u$
does not occur in termBase because it cannot have a successor
or a predecessor in the graph associated with the term base
formula. Therefore, $u$ occurs only in termHom and cardin.
Let $u^S$ be the shape variable such that $u^S = \mathrm{sh}(u)$ occurs
in termHom. Let $\psi_1, \ldots, \psi_p$ be all conjuncts of cardin that
contain $u$.

Each $\psi_i$ is of the form $|\phi|_{u^S} \geq k_i$ or $|\phi|_{u^S} = k_i$ and for
each variable $u'$ free in $\phi$ the conjunct $\mathrm{sh}(u') = u^S$ occurs
in termHom. The base formula can therefore be written in
the form

$$\beta_1 \equiv \exists x_1, \ldots, x_e, x^S_1, \ldots, x^S_f.\ \phi \wedge \alpha$$

where $\alpha$ has the form as in Lemma 65. Applying Lemma 65
we eliminate $u$ and obtain $\psi = \bigvee_i \alpha_i$ where each $\alpha_i$ is
a conjunction of cardinality constraints. Base formula $\beta_1$ is
thus equivalent to the disjunction $\bigvee_i \beta_{1,i}$ where each $\beta_{1,i}$
is a base formula

$$\beta_{1,i} \equiv \exists x_1, \ldots, x_e, x^S_1, \ldots, x^S_f.\ \phi \wedge \alpha_i$$

By repeating this process we eliminate all undetermined parameter
term variables from a base formula. Each of the
resulting base formulas contains no undetermined term variables.
It remains to eliminate undetermined shape variables.
This process is similar to term algebra quantifier elimination;
the key ingredient is Lemma 25, which relies on
the fact that undetermined parameter variables may take
on infinitely many values. We therefore ensure that undetermined
parameter shape variables are not constrained
by term and parameter variables through conjuncts outside
shapeBase.

Consider an undetermined parameter shape variable $u^S$.
$u^S$ does not occur in termHom, because all term variables
are determined and a conjunct $u^S = \mathrm{sh}(u)$ would imply that
$u^S$ is determined as well. $u^S$ can thus occur only in cardin
within some cardinality constraint $|\phi|_{u^S} = k$ or $|\phi|_{u^S} \geq k$.
Moreover, formula $\phi$ in each such cardinality constraint is
closed: otherwise $\phi$ would contain some free variable $u$; by
the definition of base formula $u$ would have to be a parameter
variable, all parameter term variables are determined,
so $u^S$ would be determined as well. Let $u^S$ denote some
shape $s$. Because $\phi$ is a closed formula, $|\phi|_{u^S}$ is equal to $0$
if $[\![\phi]\!] = \mathrm{false}$ and to the shape size $m = |\mathrm{leaves}(s)|$ if
$[\![\phi]\!] = \mathrm{true}$. (The fact that closed formulas reduce to
constraints on domain size appears in [35, Theorem 3.36,
Page 13].) After eliminating constraints equivalent to $0 = k$
and $0 \geq k$, we obtain a conjunction of simple linear constraints
of the form $m = k$ and $m \geq k$. These constraints
specify a finite or infinite set $S \subseteq \{0, 1, \ldots\}$ of possible sizes
$m$. Let $A = \{s \mid |\mathrm{leaves}(s)| \in S\}$. If the set $S$ is infinite
then it contains an infinite interval of the form $\{m_0, m_0 + 1, \ldots\}$
so the set $A$ is infinite. If $\Sigma$ contains a unary constructor
and $S$ is nonempty, then $A$ is infinite. If $\Sigma$ contains
no unary constructors and $S$ is finite then $A$ is finite and
the cardinality constraints containing $u^S$ are equivalent to
$\bigvee_{i=1}^{p} u^S = t_i$ where $A = \{t_1, \ldots, t_p\}$. We therefore generate
a disjunction of base formulas $\beta_1, \ldots, \beta_p$ where $\beta_i$ results
from $\beta$ by replacing the cardinality constraints containing
$u^S$ with $u^S = t_i$. We convert each $\beta_i$ to a disjunction
of base formulas by labelling subterms of $t_i$ with internal
shape variables and doing case analysis on the equality between
the new internal shape variables to ensure the invariants
of a base formula, as in the proof of Proposition 58. By repeating this
process for all shape variables $u^S$ where the set $S$ is finite,
we obtain base formulas where the set $A$ is infinite for every
undetermined parameter shape variable $u^S$. We may then
eliminate all undetermined parameter and non-parameter
shape variables along with the conjuncts that contain them.
The result is an equivalent formula by Lemma 25.

All variables in each of the resulting base formulas are
determined. By Corollary 62 each formula can be written
as a quantifier-free formula, and the resulting disjunction is
a quantifier-free formula. ■
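The finiteness step in the proof above (no unary constructors and a finite set $S$ of sizes give a finite set $A$ of shapes) can be illustrated by direct enumeration. The following is only a sketch under stated assumptions: the signature `SIGMA`, the tuple encoding of shapes and all function names are hypothetical and do not come from the paper.

```python
from functools import lru_cache
from itertools import product

# Hypothetical signature: constructor name -> arity. There are no unary
# constructors, which is the case in which the set A of shapes is finite.
SIGMA = {"f": 2, "g": 2}

def compositions(total, parts):
    """Yield every tuple of `parts` positive integers that sums to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

@lru_cache(maxsize=None)
def shapes_with_leaves(m):
    """Return all shapes with exactly m leaves; 'c' is the shape constant."""
    found = ["c"] if m == 1 else []
    for name, arity in SIGMA.items():
        for sizes in compositions(m, arity):
            choices = [shapes_with_leaves(k) for k in sizes]
            for children in product(*choices):
                found.append((name,) + children)
    return tuple(found)

def shapes_for_sizes(sizes):
    """The set A = {s | |leaves(s)| in S} for a finite set S of sizes."""
    return [s for m in sorted(sizes) for s in shapes_with_leaves(m)]
```

Because every constructor has arity at least two, each child has strictly fewer leaves than its parent, so the recursion terminates and every finite $S$ yields a finite list.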



5.5 One-Relation-Symbol Variance 

So far we have assumed that all constructors are covariant.
In this section we describe the changes needed to extend
the result to the case when the constructors have arbitrary
variance with respect to some distinguished binary relation
denoted $\leq$.

Definition 67 If $\phi$ is a first-order formula in the language
$L_C$, the contravariant version of $\phi$, denoted $\phi^{(-1)}$, is defined
by induction on the structure of the formula by:

$$\begin{array}{rcl}
(r(t_1, \ldots, t_k))^{(-1)} &=& r(t_1, \ldots, t_k), \quad \text{if } r \in L_C \setminus \{\leq\} \\
(t_1 \leq t_2)^{(-1)} &=& t_2 \leq t_1 \\
(\phi_1 \wedge \phi_2)^{(-1)} &=& \phi_1^{(-1)} \wedge \phi_2^{(-1)} \\
(\phi_1 \vee \phi_2)^{(-1)} &=& \phi_1^{(-1)} \vee \phi_2^{(-1)} \\
(\neg \phi)^{(-1)} &=& \neg \phi^{(-1)} \\
(\exists t.\ \phi)^{(-1)} &=& \exists t.\ \phi^{(-1)}
\end{array} \qquad (71)$$

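Define $C^{-1}$ to have the same domain and the same interpretation
of operations and relations $r \in L_C \setminus \{\leq\}$ but where

$$[\![\leq]\!]^{C^{-1}} = ([\![\leq]\!]^{C})^{-1} \qquad (72)$$

We clearly have for every formula $\phi$ and every valuation $\sigma$:

$$[\![\phi^{(-1)}]\!]^{C} \sigma = [\![\phi]\!]^{C^{-1}} \sigma \qquad (73)$$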
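The contravariant version and property (73) can be checked on a small finite structure. The sketch below assumes a hypothetical tuple encoding of formulas (`"le"`, `"eq"`, `"and"`, `"or"`, `"not"`, `"exists"`) and a finite carrier; none of these names are taken from the paper.

```python
def contravariant(phi):
    """Return the contravariant version of phi: swap the arguments of <=."""
    tag = phi[0]
    if tag == "le":
        return ("le", phi[2], phi[1])
    if tag == "eq":                      # relations other than <= are unchanged
        return phi
    if tag in ("and", "or"):
        return (tag, contravariant(phi[1]), contravariant(phi[2]))
    if tag == "not":
        return ("not", contravariant(phi[1]))
    if tag == "exists":
        return ("exists", phi[1], contravariant(phi[2]))
    raise ValueError(f"unknown formula tag: {tag}")

def holds(phi, domain, le, env):
    """Evaluate phi over a finite carrier `domain` with ordering `le` (a set of pairs)."""
    tag = phi[0]
    if tag == "le":
        return (env[phi[1]], env[phi[2]]) in le
    if tag == "eq":
        return env[phi[1]] == env[phi[2]]
    if tag == "and":
        return holds(phi[1], domain, le, env) and holds(phi[2], domain, le, env)
    if tag == "or":
        return holds(phi[1], domain, le, env) or holds(phi[2], domain, le, env)
    if tag == "not":
        return not holds(phi[1], domain, le, env)
    if tag == "exists":
        return any(holds(phi[2], domain, le, {**env, phi[1]: d}) for d in domain)
    raise ValueError(f"unknown formula tag: {tag}")

# Two-element structure with a <= b, and its inverse ordering.
DOMAIN = ("a", "b")
LE = {("a", "a"), ("a", "b"), ("b", "b")}
LE_INV = {(y, x) for (x, y) in LE}

# "x has a strict upper bound": exists y. x <= y and not (y <= x)
PHI = ("exists", "y", ("and", ("le", "x", "y"), ("not", ("le", "y", "x"))))
```

Evaluating `contravariant(PHI)` with `LE` gives the same truth value as evaluating `PHI` with `LE_INV`, for every assignment to `x`, which is the content of (73) on this example.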

If $l \in \mathrm{leaves}(s)$ is a leaf $l = \langle f^1, i^1 \rangle \ldots \langle f^n, i^n \rangle$, define
$\mathrm{variance}(l)$ as the product of integers

$$\prod_{j=1}^{n} \mathrm{variance}(f^j, i^j)$$

We generalize ( 65 1 to 

|<^(a;i,...,2;fe)] 
{I I I,/<(a;i, 



"""(ti, 



,tk) 



,x,)f(ti[i],...Mi])} 



(74) 



(75) 



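where $C^l$ denotes $C$ for $\mathrm{variance}(l) = 1$ and $C^{-1}$ for $\mathrm{variance}(l) = -1$.
Hence, the isomorphism between terms of some fixed shape
$s$ with $|\mathrm{leaves}(s)| = m$ and $C^m$ breaks, but there is still an
isomorphism with $C^{P(s)} \times (C^{-1})^{N(s)}$ where

$$\begin{array}{rcl}
P(s) &=& |\{ l \in \mathrm{leaves}(s) \mid \mathrm{variance}(l) = 1 \}| \\
N(s) &=& |\{ l \in \mathrm{leaves}(s) \mid \mathrm{variance}(l) = -1 \}|
\end{array}$$

(76)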
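As a concrete reading of $\mathrm{variance}(l)$, $P(s)$ and $N(s)$, the sketch below computes them for shapes encoded as nested tuples. The encoding and the table `VARIANCE` are assumptions made for illustration; the variance values are those of the two binary constructors used later in Example 69.

```python
# Hypothetical variance table: constructor -> variance of each argument
# (1 = covariant, -1 = contravariant).
VARIANCE = {"f": (-1, 1), "g": (1, 1)}

def leaves(shape, path=()):
    """Yield each leaf of `shape` as a path of (constructor, argument index) pairs."""
    if shape == "c":                      # the shape constant has a single leaf
        yield path
        return
    name, *children = shape
    for i, child in enumerate(children, start=1):
        yield from leaves(child, path + ((name, i),))

def variance(leaf):
    """Product of the variances of the argument positions along the path."""
    v = 1
    for name, i in leaf:
        v *= VARIANCE[name][i - 1]
    return v

def P(shape):
    """Number of leaves with variance 1."""
    return sum(1 for l in leaves(shape) if variance(l) == 1)

def N(shape):
    """Number of leaves with variance -1."""
    return sum(1 for l in leaves(shape) if variance(l) == -1)
```

For every shape, `P(shape) + N(shape)` equals the number of leaves, matching the decomposition $C^{P(s)} \times (C^{-1})^{N(s)}$.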



Because of this isomorphism, Lemma 65 still holds and we
may still use the Feferman-Vaught theorem from Section 3.3.
Equation (67) generalizes to:



(Xl, 



,Xk, 



r«(/(tii,...,iio,...,/(tfei,...,tfcO)! 
= eLi \i4>^''"'""^^''^\x^,. . .,x,)r-{tu, . . . ,ifeOi 

(77) 
The only change in the proof of Proposition 58 is the use
of (77) instead of (67). Most of the proof of Proposition 66
remains unchanged as well; the only additional difficulty is
eliminating constraints of the form $|\phi|_{u^S} = k$ and $|\phi|_{u^S} \geq k$
where $u^S$ is a parameter shape variable and $\phi$ is a closed
formula. Lemma 68 below addresses this problem.

We say that an algorithm $g$ finitely computes some function
$f : A \to 2^B$ where $B$ is an infinite set iff $g$ is a function
from $A$ to the set $\mathrm{Fin}(B) \cup \{\infty\}$ where $\mathrm{Fin}(B)$ is the set of
finite subsets of set $B$, $\infty$ is a fresh symbol, and

$$g(a) = \begin{cases} f(a), & \text{if } f(a) \in \mathrm{Fin}(B) \\ \infty, & \text{if } f(a) \notin \mathrm{Fin}(B) \end{cases} \qquad (78)$$
Lemma 68 There exists an algorithm that, given a shape
variable $u^S$ and a conjunction $\psi = \bigwedge_i \psi_i$ of cardinality
constraints where each $\psi_i$ is of the form $|\phi_i|_{u^S} = k_i$ or $|\phi_i|_{u^S} \geq k_i$
for some closed formula $\phi_i$, finitely computes the set

$$A = \{ s \mid [\![\psi]\!][u^S := s] \} \qquad (79)$$

of shapes which satisfy $\psi$ in $\mathcal{P}$.

Proof Sketch. Let $\phi$ be a closed formula in language $L_C$.
Compute $[\![\phi]\!]^{C}$ and $[\![\phi]\!]^{C^{-1}}$ and then replace $|\phi|_s$ with one
of the expressions $P(s) + N(s)$, $P(s)$, $N(s)$, or $0$ according to
the following table.

$$\begin{array}{ccc}
[\![\phi]\!]^{C} & [\![\phi]\!]^{C^{-1}} & |\phi|_s \\ \hline
\mathrm{true} & \mathrm{true} & P(s) + N(s) \\
\mathrm{true} & \mathrm{false} & P(s) \\
\mathrm{false} & \mathrm{true} & N(s) \\
\mathrm{false} & \mathrm{false} & 0
\end{array}$$
The constraints of the form $N(s) + P(s) = k$ and $N(s) + P(s) \geq k$
can be expressed as propositional combinations of
constraints of the form $N(s) = k$, $P(s) = k$, $P(s) \geq k$ and
$N(s) \geq k$. Therefore, $\psi$ can be written as a propositional
combination of these four kinds of constraints and each conjunction
$C(s)$ can further be assumed to have one of the
forms:

F1) $C_{k_P, k_N}(s) \equiv P(s) = k_P \wedge N(s) = k_N$;

F2) $C_{k_P, k_N^+}(s) \equiv P(s) = k_P \wedge N(s) \geq k_N$;

F3) $C_{k_P^+, k_N}(s) \equiv P(s) \geq k_P \wedge N(s) = k_N$;

F4) $C_{k_P^+, k_N^+}(s) \equiv P(s) \geq k_P \wedge N(s) \geq k_N$.

Let $A = \{ s \mid C(s) \}$, where $s$ ranges over shapes. To compute $A$ when $\Sigma$ contains
unary constructors, we first restrict $\Sigma$ to the language $\Sigma'$
with no unary constructors, and compute the set $A' \subseteq A$
using language $\Sigma'$. If $A'$ is empty, so is $A$; otherwise $A$ is
infinite. Assume that $\Sigma$ contains no unary constructors. Assume
further that $\Sigma$ contains at least one binary constructor and
at least one constructor that is contravariant in some argument.
Let

$$S = \{ \langle P(s), N(s) \rangle \mid s \in A \} \qquad (80)$$

Because $P(s) + N(s) = |\mathrm{leaves}(s)|$ and there are only finitely
many shapes of any given size (every constructor is of arity
at least two), it suffices to finitely compute $S$. $S$ can be
given an alternative characterization as follows. If $f \in \Sigma$,
$\mathrm{ar}(f) = k$, $f$ is covariant in the first $l$ arguments and contravariant
in the remaining $k - l$ arguments, define

$$[\![f]\!](\langle p_1, n_1 \rangle, \ldots, \langle p_k, n_k \rangle) =
\Big\langle \sum_{i=1}^{l} p_i + \sum_{i=l+1}^{k} n_i,\ \sum_{i=1}^{l} n_i + \sum_{i=l+1}^{k} p_i \Big\rangle \qquad (81)$$
Let $U$ be the subset of $\{ \langle p, n \rangle \mid p, n \geq 0 \}$ generated from the
element $\langle 1, 0 \rangle$ using operations $[\![f]\!]$ for $f \in \Sigma$. Then

$$S = \{ \langle p, n \rangle \in U \mid c(p, n) \} \qquad (82)$$

where $c(p, n)$ is the linear constraint corresponding to the
constraint $C(s)$.

Let $C(s) = C_{k_P, k_N}(s)$. Then $S \subseteq \{ \langle p, n \rangle \mid p + n = k_P + k_N \}$.
$S$ is therefore a subset of a finite set and is easily
computable, which solves case F1).

Let $C(s) = C_{k_P^+, k_N^+}(s)$. Because $\Sigma$ contains a binary constructor,
$S$ contains pairs $\langle p, n \rangle$ with arbitrarily large $p + n$,
so either the $p$ component or the $n$ component of elements of
$S$ grows unboundedly. Because $\Sigma$ contains a constructor $f$
contravariant in some argument, we can define using $f$ an
operation $\circ$ acting as a constructor covariant in at least one
argument and contravariant in at least one argument. Using
operation $\circ$ on tuples whose one component grows unboundedly
yields tuples whose both components grow unboundedly.
Therefore, $S$ is infinite, which solves case F4).

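Finally, consider the case $C(s) = C_{k_P, k_N^+}(s)$ (this will
solve the case $C(s) = C_{k_P^+, k_N}(s)$ as well). Observe that

$$C_{k_P, k_N^+}(s) = C_{k_P, 0^+}(s) \wedge \bigwedge_{i < k_N} \neg C_{k_P, i}(s) \qquad (83)$$

Because the set $S$ for each $C_{k_P, i}(s)$ is finite, it suffices to
finitely compute $S$ for $C_{k_P, 0^+}(s)$. In that case

$$S = \{ \langle p, n \rangle \in U \mid p = k_P \} \qquad (84)$$

Let

$$\begin{array}{rcl}
S_i &=& \{ \langle p, n \rangle \in U \mid p = i \} \\
T_i &=& \{ \langle p, n \rangle \in U \mid n = i \}
\end{array} \qquad (85)$$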



To finitely compute $S$, finitely compute the sets $S_i$ and $T_i$
for $0 \leq i \leq k_P$. The algorithm starts with all sets $S_i$ and $T_i$
empty and keeps adding elements according to operations
$[\![f]\!]$.

Assume that $S_0, T_0, \ldots, S_{i-1}, T_{i-1}$ are finitely computed.
The computation of $S_i$ and $T_i$ proceeds as follows. Let $f \in \Sigma$
be a constructor of arity $k$ with $l$ covariant arguments. For
$S_i$ we consider all solutions of the equation

$$p_1 + \cdots + p_l + n_{l+1} + \cdots + n_k = i \qquad (86)$$

for nonnegative integers $p_1, \ldots, p_l, n_{l+1}, \ldots, n_k$. First consider
solutions where no variable is equal to $i$. If for
one of the solutions, one of the sets $S_{p_1}, \ldots, S_{p_l}$ is infinite,
then $S_i$ is infinite; otherwise add to $S_i$ all elements $\langle i, n \rangle$
where

$$n = n_1 + \cdots + n_l + p_{l+1} + \cdots + p_k \qquad (87)$$

If $n \leq k_P$ then also add the same elements $\langle i, n \rangle$ to $T_n$.
Next, proceed analogously with $T_i$, considering solutions of

$$n_1 + \cdots + n_l + p_{l+1} + \cdots + p_k = i \qquad (88)$$

If at this point $S_i$ is not infinite and not empty, then also
consider the solutions of (87) where $p_j = i$ for some $j$. If
such a solution exists, then mark $S_i$ as infinite. Proceed analogously
with $T_i$. Finally, if both $S_i$ and $T_i$ are still finite
but there exists a solution for $S_i$ where $n_{l+j} = i$ for some $j$
and there exists a solution for $T_i$ where $p_{l+d} = i$ for some $d$, then
mark both $S_i$ and $T_i$ as infinite. This completes the sketch
of one step of the computation. (This step also applies to
$S_0$ and $T_0$; we initially assume that $\langle 1, 0 \rangle \in T_0$.) ■

Example 69 Let us apply this algorithm to the special case
where $\Sigma = \{f, g\}$ and

$$\mathrm{variance}(g) = \langle 1, 1 \rangle \qquad \mathrm{variance}(f) = \langle -1, 1 \rangle$$

Let us see what the set $S$ looks like. If $\langle x, y \rangle \in S$ define
$k \langle x, y \rangle = \langle kx, ky \rangle$ as in a vector space.

First, $\langle 1, 0 \rangle \in S$ because of $c^S$. Next $\langle 1, 1 \rangle \in S$ because
of $f^S$ and $\langle 2, 0 \rangle \in S$ because of $g^S$.

More generally, we have the following composition rule:
If $\langle p_1, n_1 \rangle, \langle p_2, n_2 \rangle \in S$ then

$$\langle p_1 + p_2, n_1 + n_2 \rangle \in S$$

because of $g^S$, and

$$\langle n_1 + p_2, p_1 + n_2 \rangle \in S$$

because of $f^S$.

Using $g^S$ we obtain all pairs $\langle p, 0 \rangle$ for $p \geq 1$. Using $f^S$
once on those we obtain $\langle 1, n \rangle$ for $n \geq 0$. Adding these we
additionally obtain $\langle p, n \rangle$ for $p \geq 2$ and $n \geq 0$. Hence we
have all pairs $\langle p, n \rangle$ for $p \geq 1$ and $n \geq 0$ and those are the
only ones that can be obtained. Thus,

$$S = \{ \langle p, n \rangle \mid p \geq 1 \wedge n \geq 0 \}$$

As expected, the case F1) yields a finite and the case F4)
an infinite set. The case F2) for $k_P = 0$ is an empty set,
otherwise it is an infinite set. The case F3) always yields
an infinite set. This solves the problem for two constructors
$f, g$.
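The closed form for $S$ in Example 69 can be cross-checked by brute force. The sketch below is not the finitely-computing algorithm of Lemma 68; it is a bounded closure under the two composition rules, with `VARIANCE`, `combine` and `generated_pairs` being hypothetical names. Since the sum $p + n$ of a result equals the total of the sums of its arguments, restricting to pairs with $p + n \leq$ `bound` loses no pair below the bound.

```python
from itertools import product

# Variances of the two binary constructors of Example 69.
VARIANCE = {"g": (1, 1), "f": (-1, 1)}

def combine(variances, args):
    """Apply the pair operation of a constructor: a covariant argument
    contributes (p, n) as is, a contravariant one contributes it swapped."""
    p = sum(pi if v == 1 else ni for v, (pi, ni) in zip(variances, args))
    n = sum(ni if v == 1 else pi for v, (pi, ni) in zip(variances, args))
    return (p, n)

def generated_pairs(bound):
    """All pairs (p, n) with p + n <= bound generated from (1, 0)."""
    pairs = {(1, 0)}
    changed = True
    while changed:
        changed = False
        for variances in VARIANCE.values():
            snapshot = tuple(pairs)       # iterate over a fixed copy
            for args in product(snapshot, repeat=len(variances)):
                new = combine(variances, args)
                if sum(new) <= bound and new not in pairs:
                    pairs.add(new)
                    changed = True
    return pairs
```

For example, `generated_pairs(6)` contains exactly the pairs with $p \geq 1$, $n \geq 0$ and $p + n \leq 6$, in agreement with the set computed in the example.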



Lemma 68 allows us to carry out the proof of Proposition 66,
so we obtain our main result for finite $C$.

Theorem 70 (Term Power Quant. Elimination)

There exists an algorithm that for a given well-defined
formula $\phi$ produces a quantifier-free formula $\phi'$ that is
equivalent to $\phi$ on $\mathcal{P}$.

Corollary 71 (Decidability of Structural Subtyping)

Let $C$ be a structure with a finite carrier and $\mathcal{P}$ a $\Sigma$-term-power
of $C$. Then the first-order theory of $\mathcal{P}$ is decidable.

6 Term-Powers of Decidable Theories 

In this section we extend the result of Section 5 on decidability
of term-powers of a base structure $C$ to allow $C$ to be an
arbitrary structure with a decidable theory, even if the carrier $C$ is infinite.

To keep a finite language in the case when $C$ is infinite,
we introduce a predicate $\mathrm{Is}_{\mathrm{PRI}}$ that allows testing whether
$t \in C$ for a term $t \in \mathcal{P}$.

In structural base formulas, we now distinguish between
1) composed variables, denoting elements $t \in \mathcal{P}$ for which
$\mathrm{Is}_f(t)$ holds for some constructor $f \in \Sigma$, and 2) primitive
variables, denoting elements $t \in \mathcal{P}$ for which $\mathrm{Is}_{\mathrm{PRI}}(t)$ holds.

Another generalization compared to Section 5 is the use
of a syntactically richer language for term-power algebras; to
some extent this richer language can be viewed as syntactic
sugar and can be simplified away.

The generalization to infinitely many primitive types and
the generalization to a richer language are orthogonal.

For most of the section we focus on covariant constructors;
Section 6.5 discusses a generalized notion of variance.

As in Section 3, let $C = \langle C, R \rangle$ be a decidable structure
where $C$ is a non-empty set and $R$ is a set of relations
interpreting some relational language $L_C$, such that each $r \in R$
is a relation of arity $\mathrm{ar}(r)$ on set $C$, i.e. $r \subseteq C^{\mathrm{ar}(r)}$. We
assume that $R$ contains a binary relation symbol $r_= \in R$,
interpreted as equality on the set $C$.

lifted relations $r'$ for $r \in L_C$:
    $r' :: \mathrm{term}^k \to \mathrm{bool}$
term algebra on terms
    constructors, $f \in \Sigma$:
        $f :: \mathrm{term}^k \to \mathrm{term}$
    constructor test, $f \in \Sigma$:
        $\mathrm{Is}_f :: \mathrm{term} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f_i :: \mathrm{term} \to \mathrm{term}$

Figure 8: Basic Operations of $\Sigma$-term-power Structure

Operations and relations of the $\Sigma$-term-power structure
are summarized in Figure 8. We will show the decidability of
the first-order theory of the structure with these operations.

In the special case when $C = \{a, b\}$ and

$$r = \{ \langle a, a \rangle, \langle a, b \rangle, \langle b, b \rangle \}$$

we obtain the theory in Section 4. When $R = \{r\}$ where $r$ is
a partial order on types, we obtain the theory of structural
subtyping of non-recursive covariant types. For an arbitrary
relational structure $C$, if $f \in \Sigma$ for $\mathrm{ar}(f) = k$ we obtain a
structure that properly contains the $k$-th strong power of
structure $C$, in the terminology of [35].

The structure of this section follows Section 4. We also
associate a boolean algebra of sets with each term $t$. However,
in this case, the elements of the associated boolean
algebra are sets of occurrences of the constants that satisfy
the given first-order formula interpreted over $C$. The
occurrences of constants within the terms of a given shape
correspond to the indices of the product structure in Section 3.3.
We call these occurrences leaves, because they can
be represented as leaves of the tree corresponding to a term.

6.1 Product Theory of Terms of a Given Shape 

In this section we define the notions shape and leafset, and
state some properties that we use in the sequel.
Let

$$\Sigma_0 = \{ c^S \} \cup \{ f^S \mid f \in \Sigma \}$$

be a set of function symbols such that $c^S$ is a fresh constant
symbol with $\mathrm{ar}(c^S) = 0$ and $f^S$ are fresh distinct function
symbols with $\mathrm{ar}(f^S) = \mathrm{ar}(f)$ for each $f \in \Sigma$. Let $\mathrm{shapified} :
\Sigma' \to \Sigma_0$ be defined by

$$\begin{array}{rcll}
\mathrm{shapified}(x) &=& c^S, & \text{if } x \in C \\
\mathrm{shapified}(f) &=& f^S, & \text{if } f \in \Sigma
\end{array}$$

Let $\mathrm{FT}(\Sigma_0)$ be the set of ground terms with signature $\Sigma_0$
and $\mathrm{FT}(\Sigma')$ the set of ground terms of signature $\Sigma'$.

Define function $\mathrm{sh} :: \mathrm{FT}(\Sigma') \to \mathrm{FT}(\Sigma_0)$ mapping each
term to its shape by

$$\mathrm{sh}(f(t_1, \ldots, t_k)) = \mathrm{shapified}(f)(\mathrm{sh}(t_1), \ldots, \mathrm{sh}(t_k))$$

for each $f \in \Sigma'$. Define $t_1 \sim t_2$ iff $\mathrm{sh}(t_1) = \mathrm{sh}(t_2)$.

Let $t$ be a term or shape and $t'$ the tree representing $t$ as
in Section 2.2. If $p$ is a path such that $t'(p)$ is defined and
denotes a constant, we write $t[p]$ to denote $t'(p)$ and call $p$ a
leaf. Note that $t[p]$ is defined iff $\mathrm{sh}(t)[p]$ is defined. On the
set of equivalent terms leaves act as indices of Section 3.3. If
$s$ is a shape, let $\mathrm{leaves}(s)$ denote the set of all leaves defined
on shape $s$.

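Generalizing tCont of Section 4.1, define function $\mathrm{tCont} :
\mathrm{FT}(\Sigma') \to C^*$ by:

$$\begin{array}{rcl}
\mathrm{tCont}(c) &=& c, \quad \text{if } c \in C \\
\mathrm{tCont}(f(t_1, \ldots, t_k)) &=& \mathrm{tCont}(t_1) \cdot \ldots \cdot \mathrm{tCont}(t_k)
\end{array}$$

Define $\delta(t) = \langle \mathrm{sh}(t), \mathrm{tCont}(t) \rangle$ and

$$B = \{ \langle s, w \rangle \mid s \in \mathrm{FT}(\Sigma_0),\ w \in C^*,\ \mathrm{tLen}(s) = \mathrm{sLen}(w) \}$$

If all constructors $f \in \Sigma$ are covariant then $\delta$ is a bijection
between $\mathrm{FT}(\Sigma')$ and $B$. Let

$$B(s_0) = \{ \langle s, w \rangle \in B \mid s = s_0 \}$$

For a fixed $s_0$, the set $B(s_0)$ is isomorphic to the power
structure $C^n$ where $n = \mathrm{tLen}(s_0)$.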
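The maps $\mathrm{sh}$, $\mathrm{tCont}$ and $\delta$ are easy to make concrete. The following sketch assumes a hypothetical encoding in which a term is either a primitive value (any non-tuple, standing for an element of $C$) or a tuple `(name, t1, ..., tk)`, and in which the shape symbol for `name` is written `name + "S"`; these conventions are illustrative only.

```python
def sh(term):
    """Shape of a term: every primitive value is replaced by the constant 'cS'."""
    if not isinstance(term, tuple):
        return "cS"
    name, *args = term
    return (name + "S",) + tuple(sh(a) for a in args)

def t_cont(term):
    """Word of primitive values of a term, read from left to right."""
    if not isinstance(term, tuple):
        return (term,)
    return tuple(x for a in term[1:] for x in t_cont(a))

def delta(term):
    """The pair (shape, contents) of a term."""
    return (sh(term), t_cont(term))

def delta_inverse(shape, word):
    """Rebuild the term with the given shape whose leaves spell `word`."""
    letters = iter(word)
    def build(s):
        if s == "cS":
            return next(letters)
        # strip the trailing 'S' to recover the constructor name
        return (s[0][:-1],) + tuple(build(child) for child in s[1:])
    return build(shape)

def same_shape(t1, t2):
    """The equivalence t1 ~ t2 iff sh(t1) = sh(t2)."""
    return sh(t1) == sh(t2)
```

Under this encoding `delta_inverse(*delta(t))` returns `t`, which illustrates why $\delta$ is a bijection onto pairs whose word length matches the number of leaves of the shape.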

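For each shape $s$ we introduce operations from Section 3.3.
To distinguish the sets of positions belonging to
different shapes, we tag each set of positions $L$ with a shape
$s$. We call the pair $\langle s, L \rangle$ a leafset. The interpretation of
each relation $r \in L_C$ is the leafset:

$$[\![r_s]\!](t_1, \ldots, t_k) = \langle s, \{ p \mid [\![r]\!]^{C}(t_1[p], \ldots, t_k[p]) \} \rangle$$

The operations $\wedge_s$, $\vee_s$, $\neg_s$, $\mathrm{true}_s$, $\mathrm{false}_s$ denote intersection, union,
complement, full set and empty set in the algebra of subsets
of the set $\mathrm{leaves}(s)$. We also introduce $\exists_s$ as the union of a
family of subsets indexed by a term of shape $s$ and $\forall_s$ as the
intersection of a family of subsets indexed by a term.

We can now express relations $r'$ in Figure 8 using the fact:

$$\begin{array}{l}
r'(t_1, \ldots, t_k) \iff \\
\quad \mathrm{sh}(t_2) = \mathrm{sh}(t_1) \wedge \ldots \wedge \mathrm{sh}(t_k) = \mathrm{sh}(t_1)\ \wedge \\
\quad r_{\mathrm{sh}(t_1)}(t_1, \ldots, t_k) = \mathrm{true}_{\mathrm{sh}(t_1)}
\end{array} \qquad (89)$$

To handle an infinite number of elements of the base
structure $C$, we do not introduce into the language constants
for every element of $C$ as in Section 5. Instead, we introduce
the predicate $\mathrm{Is}_{\mathrm{PRI}} :: \mathrm{term} \to \mathrm{bool}$ called primitive-term test
that checks whether a term is a constant:

$$\mathrm{Is}_{\mathrm{PRI}}(x) = (x \in C)$$

and the predicate $\mathrm{Is}_{\mathrm{PRIL}} :: \mathrm{leafset} \to \mathrm{bool}$ called primitive-leafset
test:

$$\mathrm{Is}_{\mathrm{PRIL}}(\langle s, L \rangle) = (s = c^S)$$

Instead of the rule (16), we have for $f, g \in \Sigma \cup \{\mathrm{PRI}\}$:

$$\begin{array}{l}
\forall x.\ \bigvee_{f \in \Sigma \cup \{\mathrm{PRI}\}} \mathrm{Is}_f(x) \\
\forall x.\ \neg(\mathrm{Is}_f(x) \wedge \mathrm{Is}_g(x)), \quad \text{for } f \not\equiv g
\end{array} \qquad (90)$$

Analogous rules hold for the term algebra of leafsets:

$$\begin{array}{l}
\forall x.\ \bigvee_{f \in \Sigma \cup \{\mathrm{PRI}\}} \mathrm{Is}_{f^L}(x) \\
\forall x.\ \neg(\mathrm{Is}_{f^L}(x) \wedge \mathrm{Is}_{g^L}(x)), \quad \text{for } f \not\equiv g
\end{array} \qquad (91)$$

The term algebra of shapes satisfies the original rules (16) of
term algebra.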
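The way a lifted relation is assembled from a per-leaf relation and the "full leafset" test can be sketched as follows for the covariant case. The term encoding (primitive values as non-tuples, constructor applications as tuples `(name, t1, ..., tk)`), the use of integer child indices as paths, and all function names are assumptions made for illustration.

```python
def sh(term):
    """Shape of a term: primitive values become the constant 'cS'."""
    if not isinstance(term, tuple):
        return "cS"
    return (term[0] + "S",) + tuple(sh(a) for a in term[1:])

def leaves(shape, path=()):
    """Yield every leaf of a shape as a tuple of 1-based child indices."""
    if shape == "cS":
        yield path
        return
    for i, child in enumerate(shape[1:], start=1):
        yield from leaves(child, path + (i,))

def at(term, path):
    """The primitive value t[p] stored at leaf `path` of `term`."""
    for i in path:
        term = term[i]
    return term

def inner_relation(rel, shape, *terms):
    """Set of leaves of `shape` at which the base relation `rel` holds."""
    return frozenset(p for p in leaves(shape)
                     if rel(*(at(t, p) for t in terms)))

def lifted(rel, *terms):
    """Lifted relation: all terms have the same shape and `rel` holds at
    every leaf, i.e. the inner relation yields the full set of leaves."""
    shape = sh(terms[0])
    if any(sh(t) != shape for t in terms[1:]):
        return False
    return inner_relation(rel, shape, *terms) == frozenset(leaves(shape))

# Base ordering on the two-element carrier {a, b} with a <= b.
LE = {("a", "a"), ("a", "b"), ("b", "b")}
def le(x, y):
    return (x, y) in LE
```

With `t1 = ("f", "a", ("g", "a", "b"))` and `t2 = ("f", "b", ("g", "a", "b"))`, the call `lifted(le, t1, t2)` holds while `lifted(le, t2, t1)` does not, because the first leaf violates the ordering.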

We use constructor-selector language for the term algebra
on terms. We introduce constructor-selector language
on shapes by generalizing operations in Section 4.1 in a natural
way. In addition, we introduce a constructor-selector
language on leafsets. For each $f \in \Sigma$ we introduce a constructor
symbol $f^L$ on leafsets and define

$$\mathrm{leafified}(f) = f^L$$

Constructors $f^L$ act on leafsets as follows. If $L_i \subseteq \mathrm{leaves}(s_i)$
for $1 \leq i \leq k$ define

$$f^L(\langle s_1, L_1 \rangle, \ldots, \langle s_k, L_k \rangle) = \langle s, L \rangle$$

where $s = f^S(s_1, \ldots, s_k)$, and $L \subseteq \mathrm{leaves}(s)$ is given by

$$L = (\{1\} \cdot L_1) \cup \cdots \cup (\{k\} \cdot L_k)$$

(Here we define $A \cdot B = \{ a \cdot b \mid a \in A \wedge b \in B \}$.)

We define selector functions on leafsets as follows. If $s =
f^S(s_1, \ldots, s_k)$ and $L \subseteq \mathrm{leaves}(s)$, then $f^L_i(\langle s, L \rangle) = \langle s_i, L_i \rangle$
where $L_i \subseteq \mathrm{leaves}(s_i)$ is defined by

$$L_i = \{ w \mid i \cdot w \in L \}$$

Equivalently, we require that

$$f^L_i(f^L(\langle s_1, L_1 \rangle, \ldots, \langle s_k, L_k \rangle)) = \langle s_i, L_i \rangle$$
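The leafset constructors and selectors can be sketched directly from these definitions. The encoding below is an assumption for illustration: a leafset is a pair `(shape, frozenset_of_paths)`, a path is a tuple of 1-based child indices, and the shape symbol of a constructor `name` is written `name + "S"`.

```python
def construct(name, *leafsets):
    """Leafset constructor: prefix each leaf of the i-th argument with i
    and take the union; the shapes are combined with the shape constructor."""
    shape = (name + "S",) + tuple(s for s, _ in leafsets)
    paths = frozenset((i,) + w
                      for i, (_, L) in enumerate(leafsets, start=1)
                      for w in L)
    return (shape, paths)

def select(i, leafset):
    """Leafset selector: keep the leaves that start with i and drop that index."""
    shape, L = leafset
    return (shape[i], frozenset(w[1:] for w in L if w[0] == i))

# The two leafsets over the primitive shape: the full one and the empty one.
FULL = ("cS", frozenset({()}))
EMPTY = ("cS", frozenset())
```

For instance, `construct("f", FULL, EMPTY)` yields the shape `("fS", "cS", "cS")` with the single leaf `(1,)`, and `select(1, ...)` and `select(2, ...)` applied to it recover `FULL` and `EMPTY`.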



6.2 A Logic for Term-Power Algebras 

To show the decidability of the first-order theory of the
structure FT* with operations in Figure 8 we show decidability
for a richer structure. Figure 9 shows the operations
and relations of this richer structure.

The structure has four sorts: bool representing truth values,
term representing terms, shape representing shapes, and
leafset representing sets of leaves within a given shape. The
structure can be seen as a combination of the operations
of Figure 5 and Figure 2.

For each relation symbol $r \in R$ we define a relation symbol
$r^*$ of sort $\mathrm{shape} \times \mathrm{term}^k \to \mathrm{bool}$ acting on terms of the
same shape. While in Section 4.2 we associate a boolean
algebra with the terms of the same shape, in this section we
associate a cylindric algebra [21] with terms of the same
shape. This is a particularly simple cylindric algebra resulting
from lifting first-order logic on the base structure
$C$ so that elements are replaced by terms of a given shape
(which are isomorphic to functions from leaves to elements),
and boolean values are replaced by sets of leaves (isomorphic
to functions from leaves to booleans). In both cases,
operations on the set $X$ are lifted to operations on the set
$\mathrm{leaves}(s) \to X$. Syntactically, we introduce a copy of all
propositional connectives and quantifiers: $\wedge^L_{\_}$, $\vee^L_{\_}$, $\neg^L_{\_}$, $\mathrm{true}^L_{\_}$,
$\mathrm{false}^L_{\_}$. Like the boolean algebra operations in Figure 5, these
syntactic constructs in Figure 9 take an additional shape
argument, because the term-power algebra contains one copy of
a strong power $C^n$ of the base structure for each shape. We call
formulas built using the operations of the cylindric algebra
inner formulas.
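per-shape product structure
    inner formula relations for $r \in L_C$:
        $r_{\_} :: \mathrm{shape} \times \mathrm{term}^k \to \mathrm{leafset}$
    inner logical connectives:
        $\wedge^L_{\_}, \vee^L_{\_} :: \mathrm{shape} \times \mathrm{leafset} \times \mathrm{leafset} \to \mathrm{leafset}$
        $\neg^L_{\_} :: \mathrm{leafset} \to \mathrm{leafset}$
        $\mathrm{true}^L_{\_}, \mathrm{false}^L_{\_} :: \mathrm{leafset}$
    inner formula quantifiers:
        $\exists^L_{\_}, \forall^L_{\_} :: \mathrm{shape} \times (\mathrm{term} \to \mathrm{leafset}) \to \mathrm{leafset}$
    leafset equality:
        $=^L :: \mathrm{leafset} \times \mathrm{leafset} \to \mathrm{bool}$
    leafset cardinality constraints, $k \geq 0$:
        $|\_|_{\_} \geq k,\ |\_|_{\_} = k :: \mathrm{shape} \times \mathrm{leafset} \to \mathrm{bool}$
    leafset quantifiers:
        $\exists^L, \forall^L :: (\mathrm{leafset} \to \mathrm{bool}) \to \mathrm{bool}$
    term equality:
        $= :: \mathrm{term} \times \mathrm{term} \to \mathrm{bool}$
    term quantifiers:
        $\exists, \forall :: (\mathrm{term} \to \mathrm{bool}) \to \mathrm{bool}$
    shape equality:
        $=^S :: \mathrm{shape} \times \mathrm{shape} \to \mathrm{bool}$
    shape quantifiers:
        $\exists^S, \forall^S :: (\mathrm{shape} \to \mathrm{bool}) \to \mathrm{bool}$
    logical connectives:
        $\wedge, \vee :: \mathrm{bool} \times \mathrm{bool} \to \mathrm{bool}$
        $\neg :: \mathrm{bool} \to \mathrm{bool}$
        $\mathrm{true}, \mathrm{false}, \mathrm{undef} :: \mathrm{bool}$

term algebra on terms
    constructors, $f \in \Sigma$:
        $f :: \mathrm{term}^k \to \mathrm{term}$
    constructor test, $f \in \Sigma$:
        $\mathrm{Is}_f :: \mathrm{term} \to \mathrm{bool}$
    primitive-term test:
        $\mathrm{Is}_{\mathrm{PRI}} :: \mathrm{term} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f_i :: \mathrm{term} \to \mathrm{term}$
    term shape:
        $\mathrm{sh} :: \mathrm{term} \to \mathrm{shape}$

term algebra on leafsets
    constructors, $f \in \Sigma$:
        $f^L :: \mathrm{leafset}^k \to \mathrm{leafset}$
    constructor test, $f \in \Sigma$:
        $\mathrm{Is}_{f^L} :: \mathrm{leafset} \to \mathrm{bool}$
    primitive-leafset test:
        $\mathrm{Is}_{\mathrm{PRIL}} :: \mathrm{leafset} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f^L_i :: \mathrm{leafset} \to \mathrm{leafset}$
    leafset shape:
        $\mathrm{lssh} :: \mathrm{leafset} \to \mathrm{shape}$

term algebra on shapes
    constructors, $f \in \Sigma_0$:
        $f^S :: \mathrm{shape}^k \to \mathrm{shape}$
    constructor test, $f \in \Sigma_0$:
        $\mathrm{Is}_{f^S} :: \mathrm{shape} \to \mathrm{bool}$
    selectors, $f \in \Sigma$:
        $f^S_i :: \mathrm{shape} \to \mathrm{shape}$

Figure 9: Operations and relations in structure $\mathcal{P}$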
For each operation in Figure 2 there is an operation in
Figure 9, potentially taking a shape as an additional argument
(for operations used to build inner formulas). The logic
further contains term algebra operations on terms, leafsets,
and shapes.

We use undecorated identifiers (e.g. $u$) to denote variables
of term sort, variables with superscript $S$ to denote
shape variables (e.g. $u^S$) and variables with superscript $L$ to
denote leafset variables (e.g. $u^L$).

Figures 10 and 11 show the semantics of the logic in Figure 9.
The first row specifies the semantics of operations in the
case when all arguments are defined and are in the domain
of the operation. The domain of each operation is in the
second column; it is omitted if it is equal to the entire domain
resulting from interpreting the sort of the operation.
All operations except for plain logical operations and quantifiers
over the bool domain are strict. Logical operations
and quantifiers over the bool domain are defined as in the
three-valued logic of Section 2.3.

We remark that values of leafset act as terms with two
constants in Figure 5. In fact, if the base structure $C$ has
only two constants then the formula $x = a$ and its propositional
combinations are sufficient to express all facts about
$C$, so in that case there is no need to distinguish between
terms and leafsets.

6.3 Some Properties of Term-Power Structure 

In this section we establish some further properties of the 
term-power structure, including the homomorphism proper- 
ties between the term algebra of terms and the term algebra 
of leafsets. We also argue that it suffices to consider a re- 
stricted class of formulas called simple formulas. 

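Recall that $r_= \in R$ is the equality relation on $C$. Given
$r_=$, we can express the equality between terms by:

$$t_1 = t_2 \iff r'_=(t_1, t_2) \iff \mathrm{sh}(t_2) = \mathrm{sh}(t_1) \wedge (r_=)_{\mathrm{sh}(t_1)}(t_1, t_2) = \mathrm{true}_{\mathrm{sh}(t_1)} \qquad (92)$$

We define the notion of a $u^S$-term as in Definition 40
except that we use different symbols for boolean algebra
operations.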
except that we use different symbols for boolean algebra 
operations. 

Definition 72 ($u^S$-terms) Let $u^S \in \mathrm{Var}^S$ be a shape variable.
The set of $u^S$-terms $\mathrm{Term}(u^S)$ is the least set such that:

1. $u^L \in \mathrm{Term}(u^S)$ for every leafset variable $u^L$;

2. $\mathrm{false}^L_{u^S}, \mathrm{true}^L_{u^S} \in \mathrm{Term}(u^S)$;

3. if $t_1, t_2 \in \mathrm{Term}(u^S)$, then also

$t_1 \wedge^L_{u^S} t_2 \in \mathrm{Term}(u^S)$,

$t_1 \vee^L_{u^S} t_2 \in \mathrm{Term}(u^S)$, and

$\neg^L_{u^S} t_1 \in \mathrm{Term}(u^S)$.

If $t^S$ is a term of shape sort, the notion of $t^S$-inner formula
is defined as follows.

Definition 73 ($u^S$-inner formula) Let $u^S \in \mathrm{Var}^S$ be a
shape variable. The set of $u^S$-inner formulas $\mathrm{Inner}(u^S)$ is
the least set such that:

1. if $u_1, \ldots, u_k$ are term variables and $r \in L_C$ such that
$\mathrm{ar}(r) = k$, then

$r_{u^S}(u_1, \ldots, u_k) \in \mathrm{Inner}(u^S)$

2. $\mathrm{false}^L_{u^S}, \mathrm{true}^L_{u^S} \in \mathrm{Inner}(u^S)$

3. if $\phi_1, \phi_2 \in \mathrm{Inner}(u^S)$ then also

$\phi_1 \wedge^L_{u^S} \phi_2 \in \mathrm{Inner}(u^S)$
$\phi_1 \vee^L_{u^S} \phi_2 \in \mathrm{Inner}(u^S)$
$\neg^L_{u^S} \phi_1 \in \mathrm{Inner}(u^S)$

4. if $\phi \in \mathrm{Inner}(u^S)$ and $u$ is a term variable that does not
occur in $u^S$, then also

$\exists^L_{u^S} u.\ \phi \in \mathrm{Inner}(u^S)$
$\forall^L_{u^S} u.\ \phi \in \mathrm{Inner}(u^S)$

If $\phi \in \mathrm{Inner}(u^S)$ and $u_1, \ldots, u_n$ is the set of free term variables
of $\phi$, we write $\phi(u^S, u_1, \ldots, u_n)$ for $\phi$. Furthermore, if
$t^S$ is a term of shape sort and $t_1, \ldots, t_n$ terms of term sort,
we write $\phi(t^S, t_1, \ldots, t_n)$ for

$$\phi[u^S := t^S, u_1 := t_1, \ldots, u_n := t_n]$$

where we assume that variables bound by $\exists^L_{\_}$ and $\forall^L_{\_}$ are
renamed to avoid the capture of variables that are free in
$t^S, t_1, \ldots, t_n$.

We call $\phi(t^S, t_1, \ldots, t_n)$ an instance of the $u^S$-inner formula
$\phi(u^S, u_1, \ldots, u_n)$.

If $\phi(u^S, u_1, \ldots, u_n)$ is an inner formula, we abbreviate
it by writing $[\phi'(u_1, \ldots, u_n)]_{u^S}$ where $\phi'$ results from
$\phi(u^S, u_1, \ldots, u_n)$ by omitting the shape argument $u^S$ from
the operations occurring in $\phi(u^S, u_1, \ldots, u_n)$. Similarly, we
write $[\phi'(t_1, \ldots, t_n)]_{t^S}$ for $\phi(t^S, t_1, \ldots, t_n)$.

According to the semantics in Figure 10, sh is a homomorphism from the term algebra of terms to the term algebra of shapes. In addition, lssh is a homomorphism from the term algebra of leafsets to the term algebra of shapes.
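The two homomorphism claims can be checked on small instances with the following sketch. It assumes an illustrative encoding (nested tuples for terms and shapes, pairs `(s, L)` for leafsets, leaf positions as paths of argument indices); the names `C_S` and `f_leafset` are hypothetical, and for simplicity the shape constructor shapified($f$) reuses the tag of $f$.

```python
C_S = "c_S"  # assumed name for the primitive shape shared by all elements of C


def sh(t):
    """sh maps a term to its shape by replacing every leaf with C_S."""
    return (t[0],) + tuple(sh(a) for a in t[1:]) if isinstance(t, tuple) else C_S


def leaves(s):
    """Leaf positions of a shape, as paths of 1-based argument indices."""
    if not isinstance(s, tuple):
        return {()}
    return {(i,) + p for i, a in enumerate(s[1:], start=1) for p in leaves(a)}


def f_leafset(name, *args):
    """Constructor on leafsets: shapes are combined, leaf sets are prefixed and united."""
    shape = (name,) + tuple(s for s, _ in args)
    marked = {(i,) + p for i, (_, L) in enumerate(args, start=1) for p in L}
    return (shape, marked)


def lssh(leafset):
    """lssh maps a leafset (s, L) to its shape s."""
    return leafset[0]


t1, t2 = ("g", "a", "b"), ("g", "b", ("g", "a", "a"))
l1 = (sh(t1), {(1,)})
l2 = (sh(t2), {(1,), (2, 2)})

# sh and lssh commute with the constructors.
assert sh(("g", t1, t2)) == ("g", sh(t1), sh(t2))
assert lssh(f_leafset("g", l1, l2)) == ("g", lssh(l1), lssh(l2))
```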

We also have the following important property. Let $r \in L_C$ be a relation symbol of arity $n$, let $f \in \Sigma$ be a function symbol of arity $k$, and let

$\mathrm{sh}(t_{1j}) = \ldots = \mathrm{sh}(t_{nj}) = s_j$

for $1 \le j \le k$. If $f^S = \mathrm{shapified}(f)$, $f^L = \mathrm{leafified}(f)$, and $s = f^S(s_1, \ldots, s_k)$ then

$$r_s(f(t_{11}, \ldots, t_{1k}), \ldots, f(t_{n1}, \ldots, t_{nk})) = f^L(r_{s_1}(t_{11}, \ldots, t_{n1}), \ldots, r_{s_k}(t_{1k}, \ldots, t_{nk})) \qquad (93)$$

Furthermore, if $\mathrm{lssh}(l_j) = \mathrm{lssh}(l'_j) = s_j$ for $1 \le j \le k$ and
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interpretation of sorts 


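$\llbracket \mathrm{term} \rrbracket = \mathrm{FT}(\Sigma')$
$\llbracket \mathrm{shape} \rrbracket = \mathrm{FT}(\Sigma_0)$
$\llbracket \mathrm{leafset} \rrbracket = \{ (s, L) \mid L \subseteq \mathrm{leaves}(s) \}$
$\llbracket \mathrm{bool} \rrbracket = \{ \mathrm{true}, \mathrm{false}, \mathrm{undef} \}$

semantics [well-definedness condition in brackets, where one applies]

inner formula relations for $r \in L_C$:
$\llbracket r \rrbracket(s, t_1, \ldots, t_k) = (s, \{ l \mid \llbracket r \rrbracket^C(t_1[l], \ldots, t_k[l]) \})$ [$\mathrm{sh}(t_1) = s \wedge \ldots \wedge \mathrm{sh}(t_k) = s$]

inner logical connectives:
$\llbracket \wedge' \rrbracket(s, (s_1, L_1), (s_2, L_2)) = (s, L_1 \cap L_2)$ [$s_1 = s \wedge s_2 = s$]
$\llbracket \vee' \rrbracket(s, (s_1, L_1), (s_2, L_2)) = (s, L_1 \cup L_2)$ [$s_1 = s \wedge s_2 = s$]
$\llbracket \neg' \rrbracket(s, (s_1, L_1)) = (s, \mathrm{leaves}(s) \setminus L_1)$ [$s_1 = s$]
$\llbracket \mathrm{true}' \rrbracket(s) = (s, \mathrm{leaves}(s))$
$\llbracket \mathrm{false}' \rrbracket(s) = (s, \emptyset)$

inner formula quantifiers, for $h : \llbracket \mathrm{term} \rrbracket \to \llbracket \mathrm{leafset} \rrbracket$:
$\llbracket \exists' \rrbracket(s, h) = (s, \bigcup \{ L \mid \exists t \in \llbracket \mathrm{term} \rrbracket.\ \mathrm{sh}(t) = s \wedge h(t) = (s, L) \})$ [$\forall t \in \llbracket \mathrm{term} \rrbracket.\ \mathrm{lssh}(h(t)) = s$]
$\llbracket \forall' \rrbracket(s, h) = (s, \bigcap \{ L \mid \exists t \in \llbracket \mathrm{term} \rrbracket.\ \mathrm{sh}(t) = s \wedge h(t) = (s, L) \})$ [$\forall t \in \llbracket \mathrm{term} \rrbracket.\ \mathrm{lssh}(h(t)) = s$]

leafset equality:
$\llbracket =^L \rrbracket((s_1, L_1), (s_2, L_2)) = (s_1 = s_2 \wedge L_1 = L_2)$

leafset cardinality constraints:
$\llbracket |(s_1, L_1)|_s \ge k \rrbracket = (|L_1| \ge k)$ [$s_1 = s$]
$\llbracket |(s_1, L_1)|_s = k \rrbracket = (|L_1| = k)$ [$s_1 = s$]

leafset quantifiers, for $h : \llbracket \mathrm{leafset} \rrbracket \to \llbracket \mathrm{bool} \rrbracket$:
$\llbracket \exists^L \rrbracket h = \exists (s, L) \in \llbracket \mathrm{leafset} \rrbracket.\ h((s, L))$
$\llbracket \forall^L \rrbracket h = \forall (s, L) \in \llbracket \mathrm{leafset} \rrbracket.\ h((s, L))$

term equality:
$\llbracket = \rrbracket(t_1, t_2) = (t_1 = t_2)$

term quantifiers, for $h : \llbracket \mathrm{term} \rrbracket \to \llbracket \mathrm{bool} \rrbracket$:
$\llbracket \exists \rrbracket h = \exists t \in \llbracket \mathrm{term} \rrbracket.\ h(t)$
$\llbracket \forall \rrbracket h = \forall t \in \llbracket \mathrm{term} \rrbracket.\ h(t)$

shape equality:
$\llbracket =^S \rrbracket(t^S_1, t^S_2) = (t^S_1 = t^S_2)$

shape quantifiers, for $h : \llbracket \mathrm{shape} \rrbracket \to \llbracket \mathrm{bool} \rrbracket$:
$\llbracket \exists^S \rrbracket h = \exists t \in \llbracket \mathrm{shape} \rrbracket.\ h(t)$
$\llbracket \forall^S \rrbracket h = \forall t \in \llbracket \mathrm{shape} \rrbracket.\ h(t)$

Figure 10: Semantics for Logic of Term-Power Algebra (Part I)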
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semantics 




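[well-definedness condition in brackets, where one applies]

term algebra on terms

constructors, $f \in \Sigma$:
$\llbracket f \rrbracket(t_1, \ldots, t_k) = f(t_1, \ldots, t_k)$

constructor test, $f \in \Sigma$:
$\llbracket \mathrm{Is}_f \rrbracket(t) = \exists t_1, \ldots, t_k.\ t = f(t_1, \ldots, t_k)$

primitive-term test:
$\llbracket \mathrm{IsPRI} \rrbracket(t) = (t \in C)$

selectors, $f \in \Sigma$:
$\llbracket f_i \rrbracket(t) = \varepsilon t_i.\ t = f(t_1, \ldots, t_i, \ldots, t_k)$ [$\llbracket \mathrm{Is}_f \rrbracket(t)$]

term shape:
$\llbracket \mathrm{sh} \rrbracket(f(t_1, \ldots, t_n)) = \mathrm{shapified}(f)(\mathrm{sh}(t_1), \ldots, \mathrm{sh}(t_n))$

term algebra on leafsets

constructors, $f \in \Sigma$:
$\llbracket f^L \rrbracket((s_1, L_1), \ldots, (s_k, L_k)) = (f^S(s_1, \ldots, s_k), (\{1\} \cdot L_1) \cup \ldots \cup (\{k\} \cdot L_k))$

constructor test, $f \in \Sigma$:
$\llbracket \mathrm{Is}_{f^L} \rrbracket((s, L)) = \exists s_1, L_1, \ldots, s_k, L_k.\ (s, L) = \llbracket f^L \rrbracket((s_1, L_1), \ldots, (s_k, L_k))$

primitive-leafset test:
$\llbracket \mathrm{IsPRI}^L \rrbracket((s, L)) = (s = c^S)$

selectors, $f \in \Sigma$:
$\llbracket f^L_i \rrbracket((s, L)) = \varepsilon (s_i, L_i).\ (s, L) = \llbracket f^L \rrbracket((s_1, L_1), \ldots, (s_i, L_i), \ldots, (s_k, L_k))$ [$\llbracket \mathrm{Is}_{f^L} \rrbracket((s, L))$]

leafset shape:
$\llbracket \mathrm{lssh} \rrbracket((s, L)) = s$

term algebra on shapes

constructors, $f \in \Sigma$:
$\llbracket f^S \rrbracket(s_1, \ldots, s_k) = f^S(s_1, \ldots, s_k)$

constructor test, $f^S \in \Sigma_0$:
$\llbracket \mathrm{Is}_{f^S} \rrbracket(s) = \exists s_1, \ldots, s_k.\ s = f^S(s_1, \ldots, s_k)$

selectors, $f \in \Sigma$:
$\llbracket f^S_i \rrbracket(s) = \varepsilon s_i.\ s = f^S(s_1, \ldots, s_i, \ldots, s_k)$ [$\llbracket \mathrm{Is}_{f^S} \rrbracket(s)$]

Figure 11: Semantics for Logic of Term-Power Algebra (Part II)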
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s = f{si,...,Sk) then 

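$$\begin{array}{rcl}
f^L(l_1, \ldots, l_k) \wedge'_s f^L(l'_1, \ldots, l'_k) & =^L & f^L(l_1 \wedge'_{s_1} l'_1, \ldots, l_k \wedge'_{s_k} l'_k) \\
f^L(l_1, \ldots, l_k) \vee'_s f^L(l'_1, \ldots, l'_k) & =^L & f^L(l_1 \vee'_{s_1} l'_1, \ldots, l_k \vee'_{s_k} l'_k) \\
\neg'_s f^L(l_1, \ldots, l_k) & =^L & f^L(\neg'_{s_1} l_1, \ldots, \neg'_{s_k} l_k) \\
\exists'_s t.\ f^L(h_1(t), \ldots, h_k(t)) & =^L & f^L(\exists'_{s_1} t.\ h_1(t), \ldots, \exists'_{s_k} t.\ h_k(t)) \\
\forall'_s t.\ f^L(h_1(t), \ldots, h_k(t)) & =^L & f^L(\forall'_{s_1} t.\ h_1(t), \ldots, \forall'_{s_k} t.\ h_k(t))
\end{array} \qquad (94)$$

From these properties by induction we conclude that if $\phi(u^S, u_1, \ldots, u_n)$ is an inner formula, then

$$\phi(s, f(t_{11}, \ldots, t_{1k}), \ldots, f(t_{n1}, \ldots, t_{nk})) = f^L(\phi(s_1, t_{11}, \ldots, t_{n1}), \ldots, \phi(s_k, t_{1k}, \ldots, t_{nk})) \qquad (95)$$

Let $\phi(u^S, u_1, \ldots, u_n)$ be an inner formula and let $\phi'(u_1, \ldots, u_n)$ be a first-order formula that results from replacing operations $\wedge'_s, \vee'_s, \neg'_s, \forall'_s, \exists'_s$ by $\wedge, \vee, \neg, \forall, \exists$. Interpreting $\phi'(u_1, \ldots, u_n)$ over the structure $C$ yields a relation $\rho' \subseteq C^n$. If

$\mathrm{sh}(t_1) = \ldots = \mathrm{sh}(t_n) = s$

then

$\phi(s, t_1, \ldots, t_n) = (s, \{ l \mid \rho'(t_1[l], \ldots, t_n[l]) \})$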
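The pointwise reading of inner formulas, and the way it commutes with constructors as in (93), can be illustrated by a small sketch. The encoding is assumed for illustration: terms are nested tuples, $C$ is taken to be the integers with $\le$ as the relation, and the helper names (`lift`, `f_leafset`, `at`) are hypothetical.

```python
C_S = "c_S"  # assumed name for the primitive shape


def sh(t):
    return (t[0],) + tuple(sh(a) for a in t[1:]) if isinstance(t, tuple) else C_S


def leaves(s):
    if not isinstance(s, tuple):
        return {()}
    return {(i,) + p for i, a in enumerate(s[1:], start=1) for p in leaves(a)}


def at(t, path):
    """t[l]: the element of C at leaf position l of term t."""
    for i in path:
        t = t[i]
    return t


def lift(rel, s, *terms):
    """(s, {l | rel(t1[l], ..., tn[l])}), defined only if every term has shape s."""
    assert all(sh(t) == s for t in terms)
    return (s, {l for l in leaves(s) if rel(*(at(t, l) for t in terms))})


def f_leafset(name, *args):
    """Leafset constructor: combine shapes, prefix leaf positions with the argument index."""
    shape = (name,) + tuple(s for s, _ in args)
    return (shape, {(i,) + p for i, (_, L) in enumerate(args, start=1) for p in L})


def le(a, b):
    return a <= b


# Instance of (93) with r = <=, f = g, n = k = 2.
t11, t12 = 1, ("g", 3, 7)
t21, t22 = 2, ("g", 3, 5)
s1, s2 = sh(t11), sh(t12)
lhs = lift(le, ("g", s1, s2), ("g", t11, t12), ("g", t21, t22))
rhs = f_leafset("g", lift(le, s1, t11, t21), lift(le, s2, t12, t22))
assert lhs == rhs
```

In this instance the leafset marks the two leaves where the first term is below the second and omits the leaf holding 7 and 5.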



The following Definition 75 introduces a more restricted set of formulas than the set of formulas permitted by sort declarations in Figure 9. We call this restricted set of formulas simple formulas. One of the main properties of simple formulas compared to arbitrary formulas is that simple formulas allow the use of operations $\exists'$, $\forall'$, and relations $r$, $r \in L_C$, only within instances of $u^S$-inner formulas.

Definition 74 A simple operation is any operation or relation in Figure 9 except for operations $\exists'$, $\forall'$, and relations $r$ for $r \in L_C$.

Definition 75 (Simple Formulas) The set of simple formulas is the least set that satisfies the following.

1. if $\phi(u^S, u_1, \ldots, u_n)$ is an inner formula, $t^S$ a term of shape sort, $t_1, \ldots, t_n$ terms of term sort and $u^L$ is a leafset variable, then

$u^L =^L \phi(t^S, t_1, \ldots, t_n)$

is a simple formula.

2. applying simple operations to simple formulas yields simple formulas.



Example 76 A formula

$$u^L =^L \exists'_{u^S_1} u.\ r_{u^S_2}(u, u) \qquad (96)$$

is not a simple formula for $u^S_1 \not\equiv u^S_2$. Formula

$(u^S_1 =^S u^S_2 \wedge u^L =^L \exists'_{u^S_1} u.\ r_{u^S_1}(u, u)) \vee (u^S_1 \neq^S u^S_2 \wedge \mathrm{undef})$

is a simple formula equivalent to formula (96). We abbreviate $\exists'_{u^S} u.\ r_{u^S}(u, u)$ as $[\exists' u.\ r(u, u)]_{u^S}$.

Lemma 77 shows that for every formula in the logic of Figure 9 there exists an equivalent simple formula. Note that even simple formulas are sufficient to express the relations of structural subtyping. A reader not interested in the decidability of the more general logic of Figure 9 may therefore ignore Lemma 77.

Lemma 77 (Formula Simplification) For every well-defined formula in the logic of Figure 9 there exists an equivalent well-defined simple formula.

Proof Sketch. According to the definition of simple formula, we need to ensure that every occurrence of quantifiers $\forall'$, $\exists'$ and relations $r$ is an occurrence in some inner-formula instance $\phi(t^S, t_1, \ldots, t_n)$. Each occurrence $r_{t^S}(t_1, \ldots, t_n)$ is an inner formula instance by itself, so the main difficulty is fitting the quantifiers $\forall'$ and $\exists'$ into inner formulas.

Let us examine the syntactic structure of formulas of the logic in Figure 9. This syntactic structure is determined by sort declarations. Each expression of leafset sort is formed starting from

1. relations $r \in L_C$;

2. leafset variables;

3. $\mathrm{true}'$, $\mathrm{false}'$

using operations $\wedge'$, $\vee'$, $\neg'$, $\forall'$, $\exists'$, as well as $f^L$ and $f^L_i$. The leafset expressions can be used in a formula in the following ways (in addition to constructing new leafset expressions):

1. to compare for equality using $=^L$;

2. to test for the top-level constructor using $\mathrm{Is}_{f^L}$;

3. to form leafset cardinality constraints;

4. to form a shape using lssh.

Because the top-level sort of a formula is bool, every term $t^L_0$ of sort leafset occurs within some formula $t^L_1 =^L t^L_2$ or $\mathrm{Is}_{f^L}(t^L)$, $|t^L|_{t^S} = k$, $|t^L|_{t^S} \ge k$ or as part of some term $\mathrm{lssh}(t^L)$. We can replace $\mathrm{Is}_{f^L}(t^L)$ with

$\exists^L u^L.\ u^L =^L t^L \wedge \mathrm{Is}_{f^L}(u^L)$

according to Lemma 10, so we need not consider that case. We can similarly eliminate non-variable leafset terms from cardinality constraints. If a leafset term $t^L$ occurs in an expression $\mathrm{lssh}(t^L)$, we consider the smallest atomic formula $\psi(\mathrm{lssh}(t^L))$ enclosing $\mathrm{lssh}(t^L)$, and replace $\psi(\mathrm{lssh}(t^L))$ with

$\exists^S u^S.\ u^S =^S \mathrm{lssh}(t^L) \wedge \psi(u^S)$

This transformation is valid by Lemma 10 because $\psi$ and lssh are strict.
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We further assume that in every atomic formula t\ —^ ±2, 
the term ii is a leafset variable. 

Suppose that a term $t^L$ in a formula $u^L =^L t^L$ is not an instance of an inner formula. Then there are two possibilities.

1. There are some occurrences of leafset term algebra operations $f^L$, $f^L_i$ or leafset variables $u^L_1$ in $t^L$. Here by "occurrence" in $t^L$ we mean occurrence that is reachable without going through a shape argument or a relation, but only through operations $\forall'$, $\exists'$, $\wedge'$, $\vee'$, $\neg'$. For example, we ignore the occurrences of $f^L$, $f^L_i$ within terms $t^S$ that occur in $\wedge'_{t^S}$.

2. Not all shape arguments in $\forall'$, $\exists'$, $\wedge'$, $\vee'$, $\neg'$, $\mathrm{true}'$, $\mathrm{false}'$, $r$ occurring in $t^L$ are syntactically identical.

We eliminate the first possibility by propagating leafset term algebra operations $f^L$, $f^L_i$ inwards until they reach expressions of form $r_{t^S}(t_1, \ldots, t_n)$, applying the equations (94) from left to right. We then convert $f^L$, $f^L_i$ operations of the term algebra of leafsets into operations of the term algebra of terms by applying (93) from right to left.

To eliminate the second possibility, let $t^S_1, \ldots, t^S_n$ be the occurrences (reachable through $\mathrm{true}'$, $\mathrm{false}'$, $\wedge'$, $\vee'$, $\neg'$, $\forall'$, $\exists'$) in term $t^L$ of the shape arguments of operations $\mathrm{true}'$, $\mathrm{false}'$, $\wedge'$, $\vee'$, $\neg'$, $\forall'$, $\exists'$. Then replace

$u^L =^L t^L(t^S_1, \ldots, t^S_n)$

with

$(\exists^S u^S.\ \forall^*_1 (u^S =^S t^S_1) \wedge \ldots \wedge \forall^*_n (u^S =^S t^S_n) \wedge u^L =^L t^L(u^S, \ldots, u^S)) \;\vee\; (\mathrm{undef} \wedge \bigvee_{1 \le i < j \le n} t^S_i \neq^S t^S_j)$

Here $\forall^*_i$ denotes universal quantification $\forall u_{i,1}, \ldots, u_{i,n_i}$, where $u_{i,1}, \ldots, u_{i,n_i}$ is a list of those term variables occurring in $t^S_i$ that are bound by some quantifier $\exists'$, $\forall'$ within $t^L$.



6.4 Quantifier Elimination 

In this section we give a quantifier elimination procedure for 
the term-power structure. The procedure of this section is 
applicable whenever C is a structure with a decidable first- 
order theory. 

Definition 78 below generalizes the notion of structural base formula of Definition 41, Section 4.3. There are two main differences between Definition 41 and the present Definition 78.

The first difference is the presence of three (instead of 
two) base formulas: shape base, leafset base, and term base. 
This difference is a consequence of the distinction between 
leafsets and terms and is needed whenever base structure 
C has more than two elements. There is a homomorphism 
formula relating leafset base formula to shape base formula 
and a homomorphism formula relating term base formula to 
shape base formula. Furthermore, some of the leafset vari- 
ables are determined by term variables using inner formula 
maps, which establishes the relationship between term base 
formula and leafset base formula. Cardinality constraints 
now apply to leafset variables. 



The second difference is the distinction between composed and primitive non-parameter leafset and term variables. A composed non-parameter variable denotes a leafset or a term whose shape $s$ has property $\mathrm{Is}_{f^S}(s)$ for some $f \in \Sigma$. A primitive non-parameter variable denotes a leafset or a term whose shape is $c^S$ and has property IsPRI or $\mathrm{IsPRI}^L$. The purpose of this distinction is to allow cardinality constraints and inner formula maps not only on parameter variables, but also on primitive non-parameter variables, which is useful when the base structure $C$ is decidable but infinite.

Definition 78 (Structural Base Formula)

A structural base formula with:

• free term variables $x_1, \ldots, x_m$;

• internal composed non-parameter term variables $u_1, \ldots, u_r$;

• internal primitive non-parameter term variables $u_{r+1}, \ldots, u_p$;

• internal parameter term variables $u_{p+1}, \ldots, u_{p+q}$;

• free leafset variables $x^L_1, \ldots, x^L_{m^L}$;

• internal composed non-parameter leafset variables $u^L_1, \ldots, u^L_{r^L}$;

• internal primitive non-parameter leafset variables $u^L_{r^L+1}, \ldots, u^L_{p^L}$;

• internal parameter leafset variables $u^L_{p^L+1}, \ldots, u^L_{p^L+q^L}$;

• free shape variables $x^S_1, \ldots, x^S_{m^S}$;

• internal non-parameter shape variables $u^S_1, \ldots, u^S_{p^S}$;

• internal parameter shape variables $u^S_{p^S+1}, \ldots, u^S_{p^S+q^S}$

is a formula of form:

$$\begin{array}{l}
\exists u_1, \ldots, u_n, u^L_1, \ldots, u^L_{n^L}, u^S_1, \ldots, u^S_{n^S}. \\
\quad \mathrm{shapeBase}(u^S_1, \ldots, u^S_{n^S}, x^S_1, \ldots, x^S_{m^S}) \;\wedge \\
\quad \mathrm{leafsetBase}(u^L_1, \ldots, u^L_{n^L}, x^L_1, \ldots, x^L_{m^L}) \;\wedge \\
\quad \mathrm{leafsetHom}(u^L_1, \ldots, u^L_{n^L}, u^S_1, \ldots, u^S_{n^S}) \;\wedge \\
\quad \mathrm{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) \;\wedge \\
\quad \mathrm{termHom}(u_1, \ldots, u_n, u^S_1, \ldots, u^S_{n^S}) \;\wedge \\
\quad \mathrm{cardin}(u^L_{r^L+1}, \ldots, u^L_{n^L}, u^S_{p^S+1}, \ldots, u^S_{n^S}) \;\wedge \\
\quad \mathrm{innerMap}(u_{r+1}, \ldots, u_n, u^L_{r^L+1}, \ldots, u^L_{n^L}, u^S_{p^S+1}, \ldots, u^S_{n^S})
\end{array}$$

where $n = p + q$, $n^L = p^L + q^L$, $n^S = p^S + q^S$, and formulas shapeBase, leafsetBase, termBase, leafsetHom, termHom, cardin, innerMap are defined as follows.

$$\mathrm{shapeBase}(u^S_1, \ldots, u^S_{n^S}, x^S_1, \ldots, x^S_{m^S}) = \bigwedge_{i=1}^{p^S} u^S_i = t_i(u^S_1, \ldots, u^S_{n^S}) \;\wedge\; \bigwedge_{i=1}^{m^S} x^S_i = u^S_{j_i} \;\wedge\; \mathrm{distinct}(u^S_1, \ldots, u^S_{n^S})$$

where each $t_i$ is a shape term of form $f^S(u^S_{i_1}, \ldots, u^S_{i_k})$ for some $f^S \in \Sigma_0$, $k = \mathrm{ar}(f)$, and $j : \{1, \ldots, m^S\} \to \{1, \ldots, n^S\}$ is a function mapping indices of free shape variables to indices of internal shape variables.

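$$\mathrm{leafsetBase}(u^L_1, \ldots, u^L_{n^L}, x^L_1, \ldots, x^L_{m^L}) = \bigwedge_{i=1}^{r^L} u^L_i = t_i(u^L_1, \ldots, u^L_{n^L}) \;\wedge\; \bigwedge_{i=r^L+1}^{p^L} \mathrm{IsPRI}^L(u^L_i) \;\wedge\; \bigwedge_{i=1}^{m^L} x^L_i = u^L_{j_i}$$

where each $t_i$ is a term of form $f^L(u^L_{i_1}, \ldots, u^L_{i_k})$ for some $f \in \Sigma$, $k = \mathrm{ar}(f)$, and $j : \{1, \ldots, m^L\} \to \{1, \ldots, n^L\}$ is a function mapping indices of free leafset variables to indices of internal leafset variables.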

$$\mathrm{leafsetHom}(u^L_1, \ldots, u^L_{n^L}, u^S_1, \ldots, u^S_{n^S}) = \bigwedge_{i=1}^{n^L} \mathrm{lssh}(u^L_i) = u^S_{j_i}$$

where $j : \{1, \ldots, n^L\} \to \{1, \ldots, n^S\}$ is some function such that $\{j_1, \ldots, j_{p^L}\} \subseteq \{1, \ldots, p^S\}$ and $\{j_{p^L+1}, \ldots, j_{p^L+q^L}\} \subseteq \{p^S + 1, \ldots, p^S + q^S\}$ (a leafset variable is a parameter variable iff its shape is a parameter shape variable).

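$$\mathrm{termBase}(u_1, \ldots, u_n, x_1, \ldots, x_m) = \bigwedge_{i=1}^{r} u_i = t_i(u_1, \ldots, u_n) \;\wedge\; \bigwedge_{i=r+1}^{p} \mathrm{IsPRI}(u_i) \;\wedge\; \bigwedge_{i=1}^{m} x_i = u_{j_i}$$

where each $t_i$ is a term of form $f(u_{i_1}, \ldots, u_{i_k})$ for some $f \in \Sigma$, $k = \mathrm{ar}(f)$, and $j : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is a function mapping indices of free term variables to indices of internal term variables.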

$$\mathrm{termHom}(u_1, \ldots, u_n, u^S_1, \ldots, u^S_{n^S}) = \bigwedge_{i=1}^{n} \mathrm{sh}(u_i) = u^S_{j_i}$$

where $j : \{1, \ldots, n\} \to \{1, \ldots, n^S\}$ is some function such that $\{j_1, \ldots, j_p\} \subseteq \{1, \ldots, p^S\}$ and $\{j_{p+1}, \ldots, j_{p+q}\} \subseteq \{p^S + 1, \ldots, p^S + q^S\}$ (a term variable is a parameter variable iff its shape is a parameter shape variable).

$$\mathrm{cardin}(u^L_{r^L+1}, \ldots, u^L_{n^L}, u^S_{p^S+1}, \ldots, u^S_{n^S}) = \psi_1 \wedge \ldots \wedge \psi_d$$

where each $\psi_i$ is of form

$|t^L(u^L_{r^L+1}, \ldots, u^L_{n^L})|_{u^S} = k$

or

$|t^L(u^L_{r^L+1}, \ldots, u^L_{n^L})|_{u^S} \ge k$

for some $u^S$-term $t^L(u^L_{r^L+1}, \ldots, u^L_{n^L})$ that contains no variables other than some of the variables $u^L_{r^L+1}, \ldots, u^L_{n^L}$, and the following condition holds:

If a variable $u^L_j$ for $r^L + 1 \le j \le n^L$ occurs in the term $t^L(u^L_{r^L+1}, \ldots, u^L_{n^L})$, then $\mathrm{lssh}(u^L_j) = u^S$ occurs in formula leafsetHom. (97)



$$\mathrm{innerMap}(u_{r+1}, \ldots, u_n, u^L_{r^L+1}, \ldots, u^L_{n^L}, u^S_{p^S+1}, \ldots, u^S_{n^S}) = \eta_1 \wedge \ldots \wedge \eta_e$$

where each $\eta_i$ is of form

$u^L_j = \phi(u^S, u_{i_1}, \ldots, u_{i_k})$

for some inner formula $\phi(u^S, u_{i_1}, \ldots, u_{i_k}) \in \mathrm{Inner}(u^S)$ where $r^L + 1 \le j \le n^L$, i.e. $u^L_j$ is a primitive non-parameter leafset variable or parameter leafset variable, $\{u_{i_1}, \ldots, u_{i_k}\} \subseteq \{u_{r+1}, \ldots, u_n\}$ are primitive non-parameter term variables and parameter variables, the conjunct $\mathrm{lssh}(u^L_j) = u^S$ occurs in leafsetHom, and the following condition holds:

$\mathrm{sh}(u_{i_j}) = u^S$ occurs in formula termHom for every $j$ where $1 \le j \le k$. (98)



We require each structural base formula to satisfy the following conditions:

P0) the graph associated with shape base formula

$\exists u^S_1, \ldots, u^S_{n^S}.\ \mathrm{shapeBase}(u^S_1, \ldots, u^S_{n^S}, x^S_1, \ldots, x^S_{m^S})$

is acyclic (compare to Definition 21);

P1) congruence closure property for shapeBase subformula: there are no two distinct variables $u^S_i$ and $u^S_j$ such that both $u^S_i = f(u^S_{l_1}, \ldots, u^S_{l_k})$ and $u^S_j = f(u^S_{l_1}, \ldots, u^S_{l_k})$ occur as conjuncts in formula shapeBase;

P2) congruence closure property for leafsetBase subformula: there are no two distinct variables $u^L_i$ and $u^L_j$ such that both $u^L_i = f^L(u^L_{l_1}, \ldots, u^L_{l_k})$ and $u^L_j = f^L(u^L_{l_1}, \ldots, u^L_{l_k})$ occur as conjuncts in formula leafsetBase;

P3) congruence closure property for termBase subformula: there are no two distinct variables $u_i$ and $u_j$ such that both $u_i = f(u_{l_1}, \ldots, u_{l_k})$ and $u_j = f(u_{l_1}, \ldots, u_{l_k})$ occur as conjuncts in formula termBase;

P4) homomorphism property of lssh: for every non-parameter leafset variable $u^L$ such that $u^L = f^L(u^L_{i_1}, \ldots, u^L_{i_k})$ occurs in leafsetBase, if conjunct $\mathrm{lssh}(u^L) = u^S$ occurs in leafsetHom, then for some shape variables $u^S_{j_1}, \ldots, u^S_{j_k}$ the term $u^S = f^S(u^S_{j_1}, \ldots, u^S_{j_k})$ occurs in shapeBase where $f^S = \mathrm{shapified}(f)$ and for every $r$ where $1 \le r \le k$, conjunct $\mathrm{lssh}(u^L_{i_r}) = u^S_{j_r}$ occurs in leafsetHom.

P5) homomorphism property of sh: for every non-parameter term variable $u$ such that $u = f(u_{i_1}, \ldots, u_{i_k})$ occurs in termBase, if conjunct $\mathrm{sh}(u) = u^S$ occurs in termHom, then for some shape variables $u^S_{j_1}, \ldots, u^S_{j_k}$ the term $u^S = f^S(u^S_{j_1}, \ldots, u^S_{j_k})$ occurs in shapeBase where $f^S = \mathrm{shapified}(f)$ and for every $r$ where $1 \le r \le k$, conjunct $\mathrm{sh}(u_{i_r}) = u^S_{j_r}$ occurs in termHom.
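Conditions P0 and P1 to P3 are purely syntactic and can be checked directly on the defining conjuncts. The following is a minimal sketch under an assumed encoding: a base formula is a dictionary from each defined variable to a tuple `(constructor, arg_1, ..., arg_k)`. The function names and the dictionary layout are hypothetical.

```python
def congruence_violation(equations):
    """Return a pair of distinct variables with identical right-hand sides, or None.

    P1-P3 require that no such pair exists in shapeBase, leafsetBase, termBase.
    """
    seen = {}
    for var, rhs in equations.items():
        if rhs in seen:
            return (seen[rhs], var)
        seen[rhs] = var
    return None


def is_acyclic(equations):
    """P0: the graph with an edge from each defined variable to its arguments has no cycle."""
    state = {}  # variable -> "open" while on the current DFS path, "done" afterwards

    def visit(v):
        if state.get(v) == "done":
            return True
        if state.get(v) == "open":
            return False  # reached a variable that is still on the DFS path
        state[v] = "open"
        ok = all(visit(a) for a in equations.get(v, ())[1:])
        state[v] = "done"
        return ok

    return all(visit(v) for v in equations)


shape_base = {
    "uS_vz": ("g", "uS_v", "uS_z"),
    "uS_v": ("g", "uS_v1", "uS_v2"),
    "uS_z": ("g", "uS_z1", "uS_z2"),
}
assert is_acyclic(shape_base)
assert congruence_violation(shape_base) is None
```

A transformation to base form would merge the two variables reported by `congruence_violation` rather than reject the formula; the sketch only detects the situation.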



As in Section 3.4 and Section 4.3 we proceed to show that each quantifier-free formula can be written as a disjunction of base formulas and each base formula can be written as a quantifier-free formula. We first give a small example to illustrate how the techniques of Section 4.3 extend to the more general case of $\Sigma$-term-power.
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Example 79 We solve one subproblem from Example [42] 
using the language of term-power algebras. 
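Consider the formula

$$\exists v.\ g(v, z) \le g(z, v) \wedge \mathrm{Is}_g(v) \wedge \mathrm{Is}_g(w) \wedge \neg(g_1(w) \le g_1(v)) \qquad (99)$$

Formula (99) is in the language of Figure 8 with $\le$ a binary lifted relation. After converting (99) into the language of Figure 9 we obtain as one of the possible cases formula:

$$\begin{array}{l}
\exists v.\ [g(v, z) \preceq g(z, v)]_{\mathrm{sh}(g(z,v))} =^L \mathrm{true}'_{\mathrm{sh}(g(z,v))} \;\wedge \\
\quad \mathrm{sh}(g(v, z)) =^S \mathrm{sh}(g(z, v)) \;\wedge \\
\quad \mathrm{Is}_g(v) \wedge \mathrm{Is}_g(w) \;\wedge \\
\quad [g_1(w) \preceq g_1(v)]_{\mathrm{sh}(g_1(w))} \neq^L \mathrm{true}'_{\mathrm{sh}(g_1(w))} \;\wedge \\
\quad \mathrm{sh}(g_1(w)) =^S \mathrm{sh}(g_1(v))
\end{array} \qquad (100)$$

where $\preceq$ is the subtyping relation on the base structure $C$ so that $\le$ is the lifted relation $\preceq'$. We next transform the formula into unnested form, obtaining: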

u^z = g{v,z) A Uzv = g{z,v) A 
Uiui = gi{w) A Uyi = gi{v) A 
uj^ =' sh{uvz) A uLi =' sh{un,i) A 
sh(M2„) =^ u^,^ A sh(tt„i) =' u^i A 
\Sg(v) A Is9(to) a 

I ^''^iiz I ii'|,j, U /\ 



(101) 



We next transform ( 101 \ into disjunction of base formulas. 
A typical base formula is: 

3Uvz , Uzv , Uv , Uz , Uni , Uvl , Uv2 , Uzl , Uz2 ,Ujnl,Uw2- 



3'u. 



VZ ? ^D 5 ^2 ! Uyl , Uy2 5 "^21 5 ^22 5 ^ItJl 



^2 5 '^TJJ 5 '^Tijl: "'If 2 ■ 



shapeBasBj^ A 
leafsetBasei A leafsetHomi A 
termBasei A termHortii A 
cardirii A innerMapj^ 

shapeBase^ = ul^ = g^{ul,,ul,) Aul, = p'(u^i,m'„2) A 

distinct«^,it^,w^i,u^2) 
leafsetBasei = u\,2 = g'~{u\,,u^) A 

Uv = g^iuvi,u\,2) Ait^ = g'-(M^i,u^2) 
leafsetHomi — lssh(ii^2) = ul^ A 

Issh(M^) = u^ A lssh(M';) — w^, A 

Issh(w^i) = ul,i A lssh(u^2) = u%^2 A 

Issh(u^i) = u^i A lssh(u^2) = -"1,2 A 

Issh(u^i) = ut„i 



(102) 



termBasei = u^z = ^(wu, W2) A Uzv — g{uz,Uy) A 

Uv = giUvl,Uy2) AUz = g{Uzl,Uz2) A 
Un, = g(w„i,«u,2) A 
Z = U2 A U) = Mm 

termHomi = 

sh(M„2) = M^z A sh(M2„) = M^2 A 

sh{uv) = u%j A sh(M2) — u%j A sh(Mu.>) = u\^ A 

sh(M„i) = it^i A sh(u2i) = it^i A sh(Mu,i) = m^i A 

sh(Mi,2) = u\,2 A Sh(u22) = u\j2 a sh('Uu,2) = U%j2 

innerMapj^ = 

ubi ='" [ut,i ^ "2i]us^j A Mzi ='" [U2I -< Uvi\u''^^ A 

■"b2 ='" [Uv2 < Uz2]ul^^ A Mz2 ='" ["22 ^ "i^alu^^^ A 
Uujl = [MtoI ^ Mi,l]uS^j 

cardini = hu^iU^^j = A h-Uzi|„s^^ = A 

h'"^2i<2 = A hM^2U=„2 ^ ^ 
hMbn lu^ , I > 1 



We next show how to transform the base formula (102) into quantifier-free form.

We substitute away non-parameter term variables $u_{vz}, u_{zv}, u_v$ and non-parameter leafset variables $u^L_{vz}, u^L_v, u^L_z$, because the homomorphism constraints they participate in may be derived from the remaining conjuncts. We next eliminate parameter term variables $u_{v1}, u_{v2}$ and parameter leafset variables $u^L_{v1}, u^L_{v2}, u^L_{z1}, u^L_{z2}, u^L_{w1}$. Grouping the conjuncts in $\mathrm{cardin}_1$ and $\mathrm{innerMap}_1$ by their shape, we may extract the subformulas $\psi_1$ and $\psi_2$ of (102).

V'l = 

dUvi.d li^;^, li^i, M^x- 
sh(M„i) =' u^i A sh(u2i) ="" -uLi A sh(M„i) ='' w^i A 
Issh(it'^i) ^'^ u^ji A Issh(ii^i) =^ ii^i A 
lssh(u';-„i) =' ul,i A 
u[,i ='- [u„i < Uzi]ui^^ ^ "^1 ^'~ ["^1 - ""i]<,i ^ 

l^u'iiy^^ =0 A hu^ilus^^ =0A 
h^Ljus^J > 1 
and 

i>2 = 
3Uv2-3'-u'^2,'U,z2- 
Sh(u„2) ='' ul,2 A Sh(?iz2) ="" "^2 A 
sh(u^2) =' ^^2 A sh(Mz2) =" ul,2 A 
Uv2 ='" [«i.2 :< Uz2]ul^2 ^ ""^2 ='" [W22 :< ^'!'2]ns„2 A 
hu^2\ul^ = A hu22l<2 ^ 

Formula $\psi_1$ expresses a fact in a structure isomorphic to the power $C^n$ where $n$ is the number of leaves in the shape denoted by $u^S_{v1}$. Similarly, $\psi_2$ expresses a fact in a product structure $C^m$ where $m$ is the number of leaves in the shape denoted by $u^S_{v2}$. We can therefore use the Feferman-Vaught technique (Section 3.3) to eliminate the quantifiers from formulas $\psi_1$ and $\psi_2$. According to Example 17, $\psi_1$ is equivalent to:

Uq ='" [3't. t :< u^i a' Uzi ^ i a' u^i :< t]ui^^_^ A 



U4 



='- [3't. t ^ M^i a' u^i ^ i a' ^'mii,! ^ i]„s_^^ A 



1m4|<,i > 1 A h'u|5 A 



' -'^4l 







We similarly apply the Feferman-Vaught construction to $\psi_2$ and obtain the result true. We may now substitute the results of quantifier elimination in $\psi_1$ and $\psi_2$. The resulting formula is:

3Uvz , Uzv ,Uv,Uz,Uiu,Uvl, Uv2 ,Uzl, Uz2 , Uml , Wu,2 ■ 
d tltiz, U„, M^, M„i, M„25 ^zl, Uz2i W„i. 



shapeBasBj A 
leafsetHom2 A 
termBase2 A termHom2 A 
cardini A innerMapj 

where 

leafsetHom2 = Issh(Mo) = u^j A lssh(M4) = u^^i 

termBase2 = Uz = g{uzi,Uz2) A u,^ = g{uw\,Uw2) /\ 
z = Uz A w = U^u 
innerMap2 — 

Uq ='" [3'i. t :< Uzi a' Uzi ^ i a' u^i < t]ui^^ A 
U4 ='- [3't. t ^ Uzi a' Uzi < t a' ^'m»i ^ i]„s^ .^ A 



cardin2 



l-ukl^,! > 1 A h'uj] a' ^'«4l<,i =0 



In the resulting formula all variables are expressible in terms of free variables, so we can write the formula without quantifiers $\exists, \forall, \exists^L, \forall^L$.



The following Proposition 80 is analogous to Proposition 44; the proof is straightforward.

Proposition 80 (Quantification of Struct. Base) If $\beta$ is a structural base formula and $x$ a free shape, leafset, or term variable in $\beta$, then there exists a structural base formula $\beta_1$ equivalent to $\exists x.\ \beta$.

The following Proposition 81 corresponds to Proposition 45.

Proposition 81 (Quantifier-Free to Structural Base) Let $\phi$ be a well-defined simple formula without quantifiers $\exists^L, \forall^L, \exists, \forall, \exists^S, \forall^S$. Then $\phi$ can be written as true, false, or a disjunction of structural base formulas.



Proof Sketch. The overall idea of the transformation to base formula is similar to the transformation in the proof of Proposition 45. Additional complexity is due to inner formulas. However, note that an inner formula $\phi(u^S, u_1, \ldots, u_n)$ is well-defined iff $\delta(u^S, u_1, \ldots, u_n)$ holds where

$\delta(u^S, u_1, \ldots, u_n) \equiv \mathrm{sh}(u_1) = u^S \wedge \ldots \wedge \mathrm{sh}(u_n) = u^S$

Hence, each formula $\phi(u^S, u_1, \ldots, u_n)$ can be treated as a partial operation $p$ of sort

$\mathrm{shape} \times \mathrm{term}^n \to \mathrm{leafset}$

and the domain given by

$D_p = \langle \langle u^S, u_1, \ldots, u_n \rangle, \delta(u^S, u_1, \ldots, u_n) \rangle$

This means that we may apply Proposition 9 and convert the formula to a disjunction of existentially quantified well-defined conjunctions of literals in one of the following forms:

1. equality with inner formulas: $u^L_0 =^L \phi(u^S, u_1, \ldots, u_n)$ where $\phi(u^S, u_1, \ldots, u_n)$ is a $u^S$-inner formula;

2. formulas of leafset boolean algebra:

$u^L_0 =^L u^L_1 \wedge'_{u^S} u^L_2$
$u^L_0 =^L u^L_1 \vee'_{u^S} u^L_2$
$u^L_0 =^L \neg'_{u^S} u^L_1$
$u^L_0 =^L \mathrm{true}'_{u^S}$
$u^L_0 =^L \mathrm{false}'_{u^S}$

3. formulas of term algebra of terms:

$u_1 = u_2$, $u_1 \neq u_2$
$u_0 = f(u_1, \ldots, u_n)$
$u = f_i(u_0)$
$\mathrm{Is}_f(u_0)$, $\neg \mathrm{Is}_f(u_0)$
$\mathrm{sh}(u) = u^S$

4. formulas of term algebra of leafsets:

$u^L_1 =^L u^L_2$, $u^L_1 \neq^L u^L_2$
$u^L_0 =^L f^L(u^L_1, \ldots, u^L_n)$
$u^L = f^L_i(u^L_0)$
$\mathrm{Is}_{f^L}(u^L_0)$, $\neg \mathrm{Is}_{f^L}(u^L_0)$
$\mathrm{lssh}(u^L) =^S u^S$

5. formulas of term algebra of shapes:

$u^S_1 =^S u^S_2$, $u^S_1 \neq^S u^S_2$
$u^S_0 =^S f^S(u^S_1, \ldots, u^S_n)$
$\mathrm{Is}_{f^S}(u^S_0)$, $\neg \mathrm{Is}_{f^S}(u^S_0)$
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We next describe transformation of each existentially 
quantified conjunction. In the sequel, whenever we perform case analysis and generate a disjunction of conjunctions, existential quantifiers propagate to the conjunctions, so we keep working with an existentially quantified conjunction. The existentially quantified variables will become internal variables of a structural base formula.

Analogously to the proof of Proposition |28[ we use 
(90|, (IgTI, |l6| to eliminate literals -n\sf{uo), ^lsyL/'-(uo), 



,(uo 



As in the proof of Proposition 45 we replace formulas of leafset boolean algebra by cardinality constraints, similarly to Figure 7.

We next convert formulas of term algebra of terms into a base formula, formulas of term algebra of leafsets into a base formula, and formulas of term algebra of shapes into a base formula.

We simultaneously make sure that every term or leafset variable has an associated shape variable, introducing new shape variables if needed.

We also ensure homomorphism requirements by replacing internal variables when we entail their equality.

Another condition we ensure is that parameter term variables map to parameter shape variables, and non-parameter term variables to non-parameter shape variables; we do this by performing expansion of term and shape variables.

We perform expansion of shape variables as in Section 3.2. Expansion of term and leafset variables is even simpler because there is no need to do case analysis on equality of a term variable with other variables.

We eliminate disequality between term variables using (92). We eliminate disequalities between leafset variables as in Example 43, by converting each disequality into a cardinality constraint. Elimination of disequalities might violate previously established homomorphism invariants, so we may need to reestablish these invariants by repeating the previously described steps. The overall process terminates because we never introduce new disequalities between term or leafset variables.
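One standard way to read the conversion of a leafset disequality into a cardinality constraint is sketched below. It assumes, as an illustration rather than as the exact encoding of Example 43, that for two leafsets of the same shape the disequality is replaced by the constraint that their symmetric difference, $(l_1 \wedge' \neg' l_2) \vee' (\neg' l_1 \wedge' l_2)$, has at least one element. Leafsets are encoded as pairs `(shape, set_of_leaves)` and the function name is hypothetical.

```python
def neq_as_cardinality(l1, l2):
    """Cardinality form of l1 != l2 for two leafsets of the same shape."""
    (s1, L1), (s2, L2) = l1, l2
    assert s1 == s2, "the constraint is only well-defined for equal shapes"
    symmetric_difference = (L1 - L2) | (L2 - L1)
    return len(symmetric_difference) >= 1


s = ("g", "c_S", "c_S")
assert neq_as_cardinality((s, {(1,)}), (s, {(2,)}))
assert not neq_as_cardinality((s, {(1,)}), (s, {(1,)}))
```

Because the constraint mentions only the boolean algebra operations and a lower bound on cardinality, it fits the form of the conjuncts allowed in cardin.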

As a final step, we convert all cardinality constraints into constraints on parameter term variables, using (95).

In the case when the shape of a cardinality constraint is $c^S$, we cannot apply (95). However, in this case, unlike Proposition 45, we do not do case analysis on all possible constant leafsets (this is not even possible in general). This is because Definition 78, unlike Definition 41, implies no need to further decompose cardinality constraints in that case, because we allow primitive non-parameter leafset variables.

This completes our sketch of transforming a quantifier-free formula into a disjunction of structural base formulas. ■

We introduce the notion of determined variables in struc- 
tural base formula generalizing Definition [29] and Defini- 
tion [H 

For brevity, we write $u^*$ for internal shape, term, or leafset variables, similarly $x^*$ for a free variable, $t^*$ for a term, $f^*$ for a shape, term, or leafset term algebra constructor, and $f^*_i$ for a shape, term, or leafset term algebra selector.

Definition 82 The set determinations of variable determinations of a structural base formula $\beta$ is the least set $S$ of pairs $(u^*, t^*)$ where $u^*$ is an internal term, leafset, or shape variable and $t^*$ is a term over the free variables of $\beta$, such that:

1. if $x^* = u^*$ occurs in termBase, leafsetBase, or shapeBase, then $(u^*, x^*) \in S$;

2. if $(u^*, t^*) \in S$ and $u^* = f^*(u^*_1, \ldots, u^*_k)$ occurs in shapeBase, termBase, or leafsetBase then $\{(u^*_1, f^*_1(t^*)), \ldots, (u^*_k, f^*_k(t^*))\} \subseteq S$;

3. if $\{(u^*_1, f^*_1(t^*)), \ldots, (u^*_k, f^*_k(t^*))\} \subseteq S$ and $u^* = f^*(u^*_1, \ldots, u^*_k)$ occurs in shapeBase, termBase, or leafsetBase then $(u^*, t^*) \in S$;

4. if $(u, t) \in S$ and $\mathrm{sh}(u) = u^S$ occurs in termHom then $(u^S, \mathrm{sh}(t)) \in S$;

5. if $(u^L, t^L) \in S$ and $\mathrm{lssh}(u^L) = u^S$ occurs in leafsetHom then $(u^S, \mathrm{lssh}(t^L)) \in S$;

6. if $u^L = \phi(u^S, u_1, \ldots, u_n)$ occurs in innerMap where $\phi(u^S, u_1, \ldots, u_n)$ is an inner formula and $\{(u^S, t^S), (u_1, t_1), \ldots, (u_n, t_n)\} \subseteq S$, then $(u^L, \phi(t^S, t_1, \ldots, t_n)) \in S$. (In the special case when $\phi$ contains no free term variables, if $(u^S, t^S) \in S$ then $(u^L, \phi(t^S)) \in S$.)
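Since the definition is a least fixed point, the set of determined variables can be computed by iterating the six rules until nothing changes. The sketch below is a simplification made for illustration: it records only which variables are determined, not the determining terms $t^*$, and it applies rule 3 in a relaxed form (a variable counts as determined as soon as all of its arguments are). The function name and the dictionary encoding of the subformulas are hypothetical.

```python
def determined(free_eqs, base, hom, inner_map):
    """Variables that are determined by the free variables of a base formula.

    free_eqs:  internal variables u* with a conjunct x* = u*           (rule 1)
    base:      dict u* -> tuple of argument variables of u* = f*(...)  (rules 2, 3)
    hom:       dict term or leafset variable -> its shape variable     (rules 4, 5)
    inner_map: dict leafset variable -> (shape variable, term variables) (rule 6)
    """
    S = set(free_eqs)
    while True:
        new = set(S)
        for u, args in base.items():
            if u in S:
                new.update(args)          # rule 2: arguments via selectors
            if all(a in S for a in args):
                new.add(u)                # rule 3 (relaxed, see lead-in)
        for u, shape_var in hom.items():
            if u in S:
                new.add(shape_var)        # rules 4 and 5
        for u_leafset, (shape_var, args) in inner_map.items():
            if shape_var in S and all(a in S for a in args):
                new.add(u_leafset)        # rule 6
        if new == S:
            return S
        S = new
```

The loop terminates because `S` only grows and is bounded by the finite set of variables occurring in the formula.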

Definition 83 An internal variable $u^*$ is determined if $(u^*, t^*) \in$ determinations for some term $t^*$. An internal variable is undetermined if it is not determined.

Lemma 84 Let $\beta$ be a structural base formula with matrix $\beta_0$ and let determinations be the determinations of $\beta$. If $(u^*, t^*) \in$ determinations then $\models \beta_0 \Rightarrow u^* = t^*$.

Proof. By induction, using Definition 82. ■

Corollary 85 Let $\beta$ be a structural base formula such that every internal variable is determined. Then $\beta$ is equivalent to a well-defined formula without quantifiers $\exists^L, \forall^L, \exists, \forall, \exists^S, \forall^S$.



Proof. By Lemma 84 using n\ 



Lemma 86 Let $u$ be an undetermined composed non-parameter term variable in a structural base formula $\beta$ such that $u$ is a source, i.e. no conjunct of form

$u' = f(u_1, \ldots, u, \ldots, u_k)$

occurs in termBase. Let $\beta'$ be the result of dropping $u$ from $\beta$. Then $\beta$ is equivalent to $\beta'$.

Proof. Because $u$ is a composed non-parameter term variable, it does not occur in innerMap, so it only occurs in termBase and termHom. The conjunct containing $u$ in termHom is a consequence of the remaining conjuncts, so it may be dropped. After that, applying (7) yields a structural base formula $\beta'$ not containing $u$, where $\beta'$ is equivalent to $\beta$. ■

Lemma 87 Let $u^L$ be an undetermined composed non-parameter leafset variable in a structural base formula $\beta$ such that $u^L$ is a source, i.e. no conjunct of form

$u'^L = f^L(u^L_1, \ldots, u^L, \ldots, u^L_k)$

occurs in leafsetBase. Let $\beta'$ be the result of dropping $u^L$ from $\beta$. Then $\beta$ is equivalent to $\beta'$.
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Proof. Because u is a composed non-parameter term 
variable, it does not occur in innerMap or cardin, so it only 
occurs in leafsetBase and leafsetHom. The conjunct con- 
taining u'" in leafsetHom is a consequence of the remaining 
conjuncts, so it may be dropped. After that, applying (ml 
yields a structural base formula /3' not containing u'~, where 
/3' is equivalent to f3. m 

Corollary 88 Every base formula is equivalent to a base formula without undetermined composed non-parameter term variables and without undetermined composed non-parameter leafset variables.

Proof. If a structural base formula has an undetermined composed non-parameter term variable, then it has an undetermined composed non-parameter term variable that is a source, similarly for leafset variables. By repeated application of Lemma 86 and Lemma 87 we eliminate all undetermined non-parameter term and leafset variables. ■
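The repeated application of the two lemmas amounts to a simple loop, sketched here under an assumed encoding: `base` maps each composed variable to the tuple of its argument variables, and `determined` is the set of determined variables, taken as given. The function name is hypothetical, and the sketch only tracks which defining conjuncts survive, not the accompanying homomorphism conjuncts.

```python
def drop_undetermined_sources(base, determined):
    """Remove undetermined composed variables that are sources, until none is left."""
    base = dict(base)
    while True:
        used_as_argument = {a for args in base.values() for a in args}
        sources = [u for u in base
                   if u not in determined and u not in used_as_argument]
        if not sources:
            return base
        for u in sources:
            del base[u]  # dropping a source can turn its arguments into new sources


term_base = {
    "u_vz": ("u_v", "u_z"),
    "u_zv": ("u_z", "u_v"),
    "u_v": ("u_v1", "u_v2"),
    "u_z": ("u_z1", "u_z2"),
    "u_w": ("u_w1", "u_w2"),
}
remaining = drop_undetermined_sources(term_base, determined={"u_z", "u_w"})
assert set(remaining) == {"u_z", "u_w"}
```

On this input, modelled on the term variables of Example 79, the loop first drops `u_vz` and `u_zv`, after which `u_v` becomes a source and is dropped as well.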



The following Proposition 89 corresponds to Proposition 53 and Proposition 66.

Proposition 89 (Struct. Base to Quantifier-Free) Every structural base formula $\beta$ is equivalent to a well-defined simple formula $\phi$ without quantifiers $\exists^L, \forall^L, \exists, \forall, \exists^S, \forall^S$.

Proof Sketch. By Corollary 88 we may assume that $\beta$ has no undetermined composed non-parameter term and leafset variables. By Corollary 85 we are done if there are no undetermined variables, so it suffices to eliminate:

1. undetermined parameter term variables, 

2. undetermined primitive non-parameter term variables, 

3. undetermined parameter leafset variables, 

4. undetermined primitive non-parameter leafset vari- 
ables, and 

5. undetermined shape variables. 

If $u$ is an undetermined parameter term variable or a primitive
non-parameter term variable, then $u$ does not occur in termBase, so it
occurs only in termHom and innerMap. If $u^L$ is an undetermined
parameter leafset variable or a primitive non-parameter leafset
variable, then $u^L$ does not occur in leafsetBase, so it occurs only
in leafsetHom, innerMap, and cardin.

For an undetermined term or leafset variable of shape $u^s$ such that
there is an uncovered parameter or primitive non-parameter term or
leafset variable with shape $u^s$, consider all conjuncts $\gamma_i$ in
innerMap of the form

  $u_j = \phi(u, u_{i_1}, \ldots, u_{i_k})$

and all conjuncts $\delta_i$ from cardin of the form

  $|t^L(u^L_{l+1}, \ldots, u^L_n)|_{u^s} = k$

or

  $|t^L(u^L_{l+1}, \ldots, u^L_n)|_{u^s} \ge k$

Together with the formulas from termHom and leafsetHom that contain
term and leafset variables free in the formulas $\gamma_i$ and
$\delta_i$, these conjuncts form a formula $\eta$ which expresses a
relation in the substructure of the term-power algebra which (because
constructors are covariant) is isomorphic to a term-power of C. We
therefore use the Feferman-Vaught theorem from Section 3.3 to eliminate
all term and parameter variables from $\eta$. By repeating this process
we eliminate all undetermined parameter and leafset variables.

It remains to eliminate undetermined shape variables. This process is
similar to term algebra quantifier elimination in Section 3.4. An
essential part of the construction in Section 3.4 is Lemma 25, which
relies on the fact that undetermined parameter variables may take on
infinitely many values. We therefore ensure that undetermined parameter
shape variables are not constrained by term and parameter variables
through conjuncts outside shapeBase. An undetermined parameter shape
variable $u^s$ does not occur in termHom or leafsetHom because there
are no parameter term and leafset variables, so $u^s$ can occur only in
innerMap and cardin.

However, because undetermined parameter and leafset variables are
eliminated from the formula, if $u^s$ is a parameter shape variable
then exactly one of these two cases holds:

1. there are some conjuncts in innerMap and cardin that contain $u^s$
and contain some determined term and leafset variables; in this case
$u^s$ is determined, or

2. there are no conjuncts in innerMap containing $u^s$, and cardin
contains only domain cardinality constraints of the form
$|1|_{u^s} = k$ and $|1|_{u^s} \ge k$.

Hence, if $u^s$ is a shape variable it remains to eliminate the
constraints of the form $|1|_{u^s} = k$ and $|1|_{u^s} \ge k$. We
eliminate these constraints as in the proof of Proposition 66.

In the resulting formula all variables are determined. By Corollary 85
the formula can be written as a formula without quantifiers
$\exists^L, \forall^L$, $\exists, \forall$, $\exists^s, \forall^s$. ■

The following is the main result of this paper. 

Theorem 90 (Term Power Quant. Elimination)

There exist algorithms A, B such that for a given formula $\phi$ in the
language of Figure 9:

a) A produces a quantifier-free formula $\phi'$ in selector language;

b) B produces a disjunction $\phi'$ of structural base formulas.

We also explicitly state the following corollary.

Corollary 91 Let C be a structure with decidable first-order theory.
Then the set of true sentences in the logic of Figure 9 interpreted in
the structure $\mathcal{P}$ according to Figures 10 and 11 is
decidable.

6.5 Handling Contravariant Constructors 

In this section we discuss the decidability of the $\Sigma$-term-power
structure for a decidable theory C when some of the function symbols
$f \in \Sigma$ are contravariant. We then suggest a generalization of
the notion of variance to multiple relations and to relations with
arity greater than two.

The modifications needed to accommodate contravariance with respect to
some distinguished relation symbol $\le \in R$ for the case of infinite
C are analogous to the modifications in Section 5.5. We thus obtain a
quantifier elimination procedure for any decidable theory C in the
presence of contravariant constructors.

Theorem 92 (Decidability of Structural Subtyping)

Let C be a decidable structure and $\mathcal{P}$ a $\Sigma$-term-power
of C. Then the first-order theory of $\mathcal{P}$ is decidable.

In the rest of this section we consider a generalization that allows
defining variance for every relation symbol $r \in R$ of any arity, and
not just the relation symbol $\le \in R$.

For a given relation symbol $r \in R$, function symbol $f \in \Sigma$
with $k = \mathrm{ar}(f)$, and integer $i$ where $1 \le i \le k$, let
$P_r(f, i)$ denote a permutation of the set $\{1, \ldots, k\}$ that
specifies the variance of the $i$-th argument of $f$ with respect to
the relation $r$. For example, if $r$ is a binary relation then
$P_r(f, i)$ is the identity permutation $\{(1,1), (2,2)\}$ if the
$i$-th argument of $f$ is covariant, or the transpose permutation
$\{(1,2), (2,1)\}$ if the $i$-th argument of $f$ is contravariant.

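If $l \in \mathrm{leaves}(s)$ is a leaf
$l = \langle f^1, i^1 \rangle \ldots \langle f^n, i^n \rangle$, define
the permutation $\mathrm{variance}(l)$ as the composition of
permutations:

  $\mathrm{variance}(l) = P_r(f^n, i^n) \circ \cdots \circ P_r(f^1, i^1)$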



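Then define $[\![r]\!]$ by

  $[\![r]\!](s, t_1, \ldots, t_k) = \{\, l \in \mathrm{leaves}(s) \mid [\![r]\!]^C(t_{p_1}[l], \ldots, t_{p_k}[l]) \,\}$,

where $(p_1, \ldots, p_k) = \mathrm{variance}(l)$.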



We generalize (76) by defining

  $N_\pi(s) = |\{\, l \in \mathrm{leaves}(s) \mid \mathrm{variance}(l) = \pi \,\}|$

As in Section 5.5 we can transform the constraints $|1|_{u^s} = k$ and
$|1|_{u^s} \ge k$ on each parameter shape variable into a conjunction
of constraints of the form

  $N_\pi(u^s) = k$

or

  $N_\pi(u^s) \ge k$
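
The composition of per-argument permutations along a leaf path, and the count of leaves per variance, can be illustrated with a small Python sketch. It assumes a single binary relation, shapes encoded as nested tuples, and a hypothetical `VARIANCE` table with an `arrow` constructor (contravariant first argument) and a `pair` constructor; none of these names come from the paper.

```python
from collections import Counter

# Assumption: a shape is the constant shape "c" or a tuple (f, child_1, ..., child_k).
# Assumption: VARIANCE[f][i] plays the role of P_r(f, i) for one binary relation r,
# written as a 0-based tuple p where p[j] is the image of position j.
IDENTITY = (0, 1)   # covariant argument
SWAP = (1, 0)       # contravariant argument
VARIANCE = {"arrow": [SWAP, IDENTITY], "pair": [IDENTITY, IDENTITY]}

def compose(p, q):
    """Return the permutation p o q (apply q first, then p)."""
    return tuple(p[q[j]] for j in range(len(q)))

def leaf_variances(shape, acc=IDENTITY):
    """Yield variance(l) for every leaf l of the shape, walking from the root."""
    if shape == "c":
        yield acc
        return
    f, *children = shape
    for i, child in enumerate(children):
        # One constructor deeper: compose P_r(f, i) on the left of the prefix.
        yield from leaf_variances(child, compose(VARIANCE[f][i], acc))

def count_by_variance(shape):
    """N_pi(s): the number of leaves of s whose variance equals pi."""
    return Counter(leaf_variances(shape))

s = ("arrow", ("arrow", "c", "c"), ("pair", "c", "c"))
counts = count_by_variance(s)
```

For this shape the leaf reached through two contravariant positions is covariant again, so three leaves carry the identity permutation and one carries the transpose.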

A problem on nonnegative integers. To solve the problem of variance
with any number of relation symbols of any arity, it suffices to solve
the following problem on sets of tuples of non-negative integers.

Let $\mathrm{Nat} = \{0, 1, 2, \ldots\}$. Consider the structure
$St = \mathrm{Nat}^d$ for some $d \ge 2$ and let
$D = \{1, 2, \ldots, d\}$. If $p$ is a permutation on $D$, let $M_p$
denote an operation $St \to St$ defined by

  $M_p(x_1, \ldots, x_d) = (x_{p_1}, \ldots, x_{p_d})$

If $(x_1, \ldots, x_d), (y_1, \ldots, y_d) \in St$ define

  $(x_1, \ldots, x_d) + (y_1, \ldots, y_d) = (x_1 + y_1, \ldots, x_d + y_d)$

Consider a finite set of operations $f : St^k \to St$ where each
operation $f$ is determined by $k$ permutations
$p^f_1, \ldots, p^f_k$ in the following way:

  $f(t_1, \ldots, t_k) = M_{p^f_1}(t_1) + \cdots + M_{p^f_k}(t_k)$

Hence, each operation $f$ of arity $k$ is given by $k$ permutations,
each of which specifies how to exchange the order of the components of
the corresponding argument tuple. After permuting the components, the
tuples are summed up.

Given a finite set $F$ of operations $f$, let $S$ be the set generated
by operations in $F$ starting from the element
$(1, 0, \ldots, 0) \in St$. Let $C(n_1, \ldots, n_d)$ be a conjunction
of simple linear constraints of the forms

  $n_i = a_i$

and

  $n_i \ge a_i$

Consider the set

  $A_C = \{\, (n_1, \ldots, n_d) \in S \mid C(n_1, \ldots, n_d) \,\}$

The problem is: for a given set of operations $F$, is there an
algorithm that, given $C(n_1, \ldots, n_d)$, finitely computes the set
$A_C$?

End of a problem on nonnegative integers.
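
To make the objects in this problem concrete, the following Python sketch enumerates the elements of $S$ whose components sum to at most a given bound and then filters them by a constraint $C$. It is only a bounded exploration under stated assumptions (0-based permutations, a hypothetical `generated_up_to` helper, constraints encoded as `(index, op, value)` triples); it does not answer the open question, which asks for a finite description of the generally infinite set $A_C$.

```python
from itertools import product

def permute(p, x):
    """M_p(x_1, ..., x_d) = (x_{p_1}, ..., x_{p_d}), using 0-based indices."""
    return tuple(x[j] for j in p)

def apply_op(perms, args):
    """f(t_1, ..., t_k) = M_{p_1}(t_1) + ... + M_{p_k}(t_k), componentwise."""
    moved = [permute(p, t) for p, t in zip(perms, args)]
    return tuple(sum(col) for col in zip(*moved))

def generated_up_to(ops, d, bound):
    """Elements of S whose components sum to at most `bound` (bound >= 1).

    Each operation adds up the component sums of its arguments, so an element
    within the bound is always built from elements within the bound.
    """
    seen = {(1,) + (0,) * (d - 1)}
    changed = True
    while changed:
        changed = False
        for perms in ops:
            for args in product(sorted(seen), repeat=len(perms)):
                t = apply_op(perms, args)
                if sum(t) <= bound and t not in seen:
                    seen.add(t)
                    changed = True
    return seen

def satisfies(t, constraints):
    """Check a conjunction of constraints n_i = a or n_i >= a."""
    return all((t[i] == a) if op == "=" else (t[i] >= a)
               for i, op, a in constraints)

SWAP, IDENT = (1, 0), (0, 1)
ops = [(SWAP, IDENT)]           # one binary operation: first argument is swapped
S4 = generated_up_to(ops, d=2, bound=4)
A_C = {t for t in S4 if satisfies(t, [(0, "=", 1), (1, ">=", 2)])}
```

With the single operation above, the bounded set contains seven tuples, and the constraint $n_1 = 1 \wedge n_2 \ge 2$ selects $(1,2)$ and $(1,3)$ among them.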

We conjecture that the technique of Lemma 68 can be
generalized to yield a solution to the problem on nonnegative 
integers and thus establish the decidability for the notion of 
variance with respect to any number of relations with any 
number of arguments. 

6.6 A Note on Element Selection 

We make a brief note related to the choice of the language 
for making statements in term-power algebras. In Section 5
we avoided the use of leafset variables by substituting them 
into cardinality constraints. In this section we use a cylindric 
algebra of leafsets. 

An apparently even more flexible alternative is to allow the element
selection operation

  select :: term × leaf → elem

where elem is a new sort, interpreted over the set C, and leaf is a
sort interpreted over the set of pairs of a shape and a leaf. Instead
of the formula

  $r_{u^s}(t_1, \ldots, t_n) =^L \mathrm{true}_{u^s}$

we would then write

  $\forall l.\ r_{u^s}(\mathrm{select}(t_1, l), \ldots, \mathrm{select}(t_n, l)) = \mathrm{true}_{u^s}$

Using the select operation we can define the update relation:

  $\mathrm{update}(t_1, l_0, e, t_2) = \forall l.\ ((l = l_0 \wedge \mathrm{select}(t_2, l) = e) \vee (l \ne l_0 \wedge \mathrm{select}(t_2, l) = \mathrm{select}(t_1, l)))$
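
As an illustration of the select operation and the update relation, here is a minimal Python sketch. The encoding is an assumption made for the example: terms are nested tuples `(constructor, child, ...)`, primitive elements are non-tuple values, and a leaf is a tuple of 0-based child indices; the helper names are hypothetical.

```python
def shape(t):
    """Replace every primitive element by the constant shape "c"."""
    if not isinstance(t, tuple):
        return "c"
    f, *children = t
    return (f,) + tuple(shape(c) for c in children)

def leaves(t, path=()):
    """Yield the paths of all leaves of t."""
    if not isinstance(t, tuple):
        yield path
        return
    _, *children = t
    for i, child in enumerate(children):
        yield from leaves(child, path + (i,))

def select(t, path):
    """select(t, l): the primitive element stored at leaf l of t."""
    for i in path:
        t = t[i + 1]          # index 0 holds the constructor name
    return t

def is_update(t1, l0, e, t2):
    """The update relation: t2 agrees with t1 except that leaf l0 holds e."""
    if shape(t1) != shape(t2):
        return False
    return all((l == l0 and select(t2, l) == e) or
               (l != l0 and select(t2, l) == select(t1, l))
               for l in leaves(t1))

def update(t1, l0, e):
    """Build the unique t2 with is_update(t1, l0, e, t2)."""
    if not l0:
        return e
    i = l0[0]
    return t1[:i + 1] + (update(t1[i + 1], l0[1:], e),) + t1[i + 2:]

t1 = ("g", 1, ("g", 2, 3))
t2 = update(t1, (1, 0), 7)
```

The check in `is_update` follows the quantified formula literally, ranging over the finitely many leaves of the common shape.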

The resulting language is at least as expressive as the language in
Figure 5. This language is interesting because it allows reasoning
about updates to leaves of a tree of fixed shape, thus generalizing the
theory of updatable arrays [33] to the theory of trees with update
operations, which would be useful for program verification. We did not
choose this more expressive language in this report for the following
reason.

If the base structure C has a finite domain $C$, then for a certain
reasonable choice of the relations interpreting $L_C$ it is possible to
express statements of this extended language in the logic of Figure 9.
The idea is to assume a partial order on the elements of $C$ with a
minimal element, and to use terms $t$ with exactly one non-minimal leaf
to model the leaves.
On the other hand, in the more interesting case when C is infinite, we
can easily obtain undecidable theories in the presence of the selection
operation. Namely, the selection operation allows terms to be used as
finite sets of elements of C. The term-power therefore increases the
expressiveness from the first-order theory to the weak monadic
second-order theory, which allows quantification over finite sets of
objects. Weak monadic theory allows in particular inductive
definitions. Even if the theory of the structure C is decidable, its
weak monadic theory might therefore be undecidable. As an example we
might take the term algebra itself, whose weak monadic theory would
allow defining the subterm relation, yielding an undecidable theory
[56, Page 508].

7 Some Connections with MSOL 

This section explores some relationships between the theory of
structural subtyping and monadic second-order logic (MSOL) interpreted
over tree-like structures. We present it as a series of remarks that
are potentially useful for understanding the first-order theory of
structural subtyping of recursive types; see [36, 37] for similar
results in the context of the theory of feature trees.

In Section 7.1 we exhibit an embedding of MSOL of the infinite binary
tree into the first-order theory of structural subtyping of recursive
types with two constant symbols a, b and one covariant binary function
symbol f. MSOL of the infinite binary tree is decidable. Although the
embedding does not give an answer to the decidability of structural
subtyping of recursive types, it does show that the problem is at least
as difficult as the decidability of MSOL over infinite trees. We
therefore expect that, if the theory of structural subtyping of
recursive types is decidable, the decidability proof will likely either
use the decidability of MSOL over infinite trees, or directly use
techniques similar to those of [18, 57].

In Section 7.2 we use the embedding in Section 7.1 to argue the
decidability of formulas of the first-order theory of structural
subtyping of recursive types where variables range over terms of a
certain fixed infinite shape $s_e$.

In Section 7.3 we present an encoding of all terms using terms of shape
$s_e$. We argue that the main obstacle in using this encoding to show
the decidability of the first-order theory of structural subtyping of
recursive types is the inability to define the set of all prefix-closed
terms of the shape $s_e$.

In Section 7.4 we generalize the decidability result of Section 7.2 by
allowing different variables to range over different constant shapes.

In Section 7.5 we illustrate some of the difficulties in reducing the
first-order theory of structural subtyping to MSOL over tree-like
structures. We show that if we use a certain form of infinite feature
trees instead of infinite terms, the decidability follows.

In Section 7.6 we point out that monadic second-order logic with
prefix-closed sets is undecidable, which follows from [48]. This fact
indicates that if we hope to show the decidability of structural
subtyping of recursive types, it is essential to maintain the
incomparability of types of different shape.

7.1 Structural Subtyping Recursive Types 

In this section we define the problem of structural subtyping of
recursive types. We then give an embedding of MSOL of the infinite
binary tree into the first-order theory of structural subtyping of
infinite terms over the signature $\Sigma = \{a, b, g\}$ with the
partial order $\le$.

We define MSOL over the infinite binary tree [H] Page 317] as the
structure
$\mathrm{MSOL}^{(2)} = \langle \{0,1\}^*, \mathrm{succ}_0, \mathrm{succ}_1 \rangle$.
The domain of the structure is the set $\{0,1\}^*$ of all finite
strings over the alphabet $\{0,1\}$. We denote first-order variables by
lowercase letters such as $x, y, z$. First-order variables range over
finite words $w \in \{0,1\}^*$. We denote second-order variables by
uppercase letters such as $X, Y, Z$. Second-order variables range over
finite and infinite subsets $S \subseteq \{0,1\}^*$. The only
relational symbol is equality, with the standard interpretation. There
are two function symbols, denoting the appending of the symbol 0 and
the appending of the symbol 1 to a word:

  $\mathrm{succ}_0\, w = w \cdot 0$

  $\mathrm{succ}_1\, w = w \cdot 1$

For the purpose of embedding into the first-order theory of structural
subtyping, we consider a structure
$\mathrm{MSOL}^{(1)} = \langle 2^{\{0,1\}^*}, \subseteq, \mathrm{Succ}_0, \mathrm{Succ}_1 \rangle$
equivalent to $\mathrm{MSOL}^{(2)}$. We use the language of MSOL
without first-order variables to make statements within
$\mathrm{MSOL}^{(1)}$. $\subseteq$ is a binary relation on sets
denoting the subset relation:

  $Y_1 \subseteq Y_2 \iff \forall x.\ x \in Y_1 \Rightarrow x \in Y_2$

$\mathrm{Succ}_0$ and $\mathrm{Succ}_1$ are binary relations on sets,
$\mathrm{Succ}_0, \mathrm{Succ}_1 \subseteq 2^{\{0,1\}^*} \times 2^{\{0,1\}^*}$,
defined as follows:

  $\mathrm{Succ}_0(Y_1, Y_2) \iff Y_2 = \{\, w \cdot 0 \mid w \in Y_1 \,\}$
  $\mathrm{Succ}_1(Y_1, Y_2) \iff Y_2 = \{\, w \cdot 1 \mid w \in Y_1 \,\}$

The structure $\mathrm{MSOL}^{(1)}$ is similar to the one in [18]; the
difference is that the relations $\mathrm{Succ}_0$ and
$\mathrm{Succ}_1$ are true even for non-singleton sets.

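Lemmas 93 and 94 show the expected equivalence of
$\mathrm{MSOL}^{(2)}$ and $\mathrm{MSOL}^{(1)}$.

Lemma 93 ($\mathrm{MSOL}^{(2)}$ expresses $\mathrm{MSOL}^{(1)}$) Every
relation on sets definable in $\mathrm{MSOL}^{(1)}$ is definable in
$\mathrm{MSOL}^{(2)}$.

Proof. We express the relations $\subseteq$, $\mathrm{Succ}_0$,
$\mathrm{Succ}_1$ as formulas in $\mathrm{MSOL}^{(2)}$, as follows. We
express $Y_1 \subseteq Y_2$ as

  $\forall x.\ Y_1(x) \Rightarrow Y_2(x)$,

$\mathrm{Succ}_0(Y_1, Y_2)$ as

  $\forall x.\ Y_2(x) \iff \exists y.\ Y_1(y) \wedge x = \mathrm{succ}_0(y)$,

and $\mathrm{Succ}_1(Y_1, Y_2)$ as

  $\forall x.\ Y_2(x) \iff \exists y.\ Y_1(y) \wedge x = \mathrm{succ}_1(y)$.

The statement follows by induction on the structure of formulas. ■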

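Let $R \subseteq (2^{\{0,1\}^*})^k \times (\{0,1\}^*)^n$ be a relation
of arity $k + n$. Define
$R^* \subseteq (2^{\{0,1\}^*})^k \times (2^{\{0,1\}^*})^n$ by

  $R^*(Y_1, \ldots, Y_k, X_1, \ldots, X_n) = \exists x_1, \ldots, x_n.\ X_1 = \{x_1\} \wedge \cdots \wedge X_n = \{x_n\} \wedge R(Y_1, \ldots, Y_k, x_1, \ldots, x_n)$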
Lemma 94 ($\mathrm{MSOL}^{(1)}$ expresses $\mathrm{MSOL}^{(2)}$) If $R$
is definable in $\mathrm{MSOL}^{(2)}$, then $R^*$ is definable in
$\mathrm{MSOL}^{(1)}$.

Proof Sketch. The property of being an empty set is definable in
$\mathrm{MSOL}^{(1)}$ by the formula

  $\phi_0(Y_1) = \forall Y_2.\ Y_1 \subseteq Y_2$

The relation $\subset$ of being a proper subset is definable in
$\mathrm{MSOL}^{(1)}$ by the formula

  $\phi_1(Y_1, Y_2) = Y_1 \subseteq Y_2 \wedge Y_1 \ne Y_2$

and the relation $\subset_1$ of having one element more is definable by
the formula

  $\phi_2(Y_1, Y_2) = Y_1 \subset Y_2 \wedge \neg \exists Z.\ Y_1 \subset Z \wedge Z \subset Y_2$

The property of being a singleton set can then be expressed by the
formula

  $\phi_3(Y_1) = \exists Y_0.\ \phi_0(Y_0) \wedge Y_0 \subset_1 Y_1$

We define the relation on singletons corresponding to
$\mathrm{succ}_0$ by

  $\phi_4(Y_1, Y_2) = \phi_3(Y_1) \wedge \phi_3(Y_2) \wedge \mathrm{Succ}_0(Y_1, Y_2)$

Similarly, the relation corresponding to $\mathrm{succ}_1$ is defined
by

  $\phi_5(Y_1, Y_2) = \phi_3(Y_1) \wedge \phi_3(Y_2) \wedge \mathrm{Succ}_1(Y_1, Y_2)$

If $R$ is expressible by some formula $\psi$ in $\mathrm{MSOL}^{(2)}$,
then $R$ is expressible by a formula in prenex normal form, so suppose
$\psi$ is of the form

  $Q_1 V_1 \ldots Q_n V_n.\ \psi_0$

where $\psi_0$ is quantifier free. We construct a formula $\psi'$
expressing $R^*$ in $\mathrm{MSOL}^{(1)}$. We obtain the matrix
$\psi'_0$ of $\psi'$ by translating $\psi_0$ as follows. If $x$ is a
first-order variable in $\psi_0$, we represent it with a second-order
variable $X$ denoting a singleton set. We replace the membership
relation $Y(x)$ with the subset relation $X \subseteq Y$. We replace
$\mathrm{succ}_0$ with $\phi_4$, and $\mathrm{succ}_1$ with $\phi_5$.
We construct $\psi'$ by adding quantifiers to $\psi'_0$ as follows.
Second-order quantifiers remain the same. First-order quantifiers are
relativized to range over singleton sets: $\forall x.\ \psi_1$ becomes
$\forall X.\ \phi_3(X) \Rightarrow \psi'_1$ and $\exists x.\ \psi_1$
becomes $\exists X.\ \phi_3(X) \wedge \psi'_1$. ■
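
The definability chain used in this proof sketch (empty set, proper subset, one element more, singleton) can be checked by brute force on a small universe. The Python sketch below is an illustration under a simplifying assumption: set quantifiers range over subsets of the seven binary words of length at most 2 rather than over all subsets of $\{0,1\}^*$. The function names `phi0` to `phi4` mirror the formulas above and are otherwise hypothetical.

```python
from itertools import combinations

# Assumption: a finite universe stands in for {0,1}*.
UNIVERSE = ["", "0", "1", "00", "01", "10", "11"]
SUBSETS = [frozenset(c)
           for r in range(len(UNIVERSE) + 1)
           for c in combinations(UNIVERSE, r)]

def phi0(y1):
    """Empty set: forall Y2. Y1 is a subset of Y2."""
    return all(y1 <= y2 for y2 in SUBSETS)

def phi1(y1, y2):
    """Proper subset."""
    return y1 <= y2 and y1 != y2

def phi2(y1, y2):
    """Y2 has exactly one element more than Y1: nothing lies strictly between."""
    return phi1(y1, y2) and not any(phi1(y1, z) and phi1(z, y2) for z in SUBSETS)

def phi3(y1):
    """Singleton: one element more than some empty set."""
    return any(phi0(y0) and phi2(y0, y1) for y0 in SUBSETS)

def succ(bit, y1, y2):
    """Succ_0 / Succ_1 on sets: Y2 = { w.bit | w in Y1 }."""
    return y2 == frozenset(w + bit for w in y1)

def phi4(y1, y2):
    """succ_0 on words, expressed on singleton sets."""
    return phi3(y1) and phi3(y2) and succ("0", y1, y2)

singletons = [y for y in SUBSETS if phi3(y)]
```

Only the subset relation is used to pick out singletons, which is why first-order variables can be dropped in favor of set variables relativized by `phi3`.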

We can view $\mathrm{MSOL}^{(1)}$ as a first-order structure with the
domain $2^{\{0,1\}^*}$. We show how to embed $\mathrm{MSOL}^{(1)}$ into
the first-order theory of structural subtyping.

We define the first-order structure of structural subtyp- 
ing of recursive types similarly to the corresponding struc- 
ture for non-recursive types in Section H] the only difference 
is that the domain contains both finite and infinite terms.
Infinite terms correspond to infinite trees [12, 30].

We define infinite trees as follows. We use the alphabet $\{l, r\}$ to
denote paths in the tree. A tree domain $D$ is a finite or infinite
subset of the set $\{l, r\}^*$ such that:

1. $D$ is prefix-closed: if $w \in \{l, r\}^*$ and $x \in \{l, r\}$,
then $w \cdot x \in D$ implies $w \in D$;

2. if $w \in D$ then exactly one of the following two properties holds:

(a) $w$ is an interior node: $\{w \cdot l, w \cdot r\} \subseteq D$;

(b) $w$ is a leaf: $\{w \cdot l, w \cdot r\} \cap D = \emptyset$.

A tree with a tree domain $D$ is a total function $T$ from the set of
leaves of $D$ to the set $\{a, b\}$.

Note that the tree domain $D$ of a tree $T$ can be reconstructed from
$T$ as the prefix closure of the domain of the graph of the function
$T$; we write $\mathrm{TDom}(T)$ for the tree domain of the tree $T$.

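Two trees are equal if they are equal as functions. Hence, equal trees
have equal function domains and equal tree domains.

We say that $T_1 \le T_2$ iff $\mathrm{TDom}(T_1) = \mathrm{TDom}(T_2)$
and $T_1(w) \le_0 T_2(w)$ for every leaf $w$ of $\mathrm{TDom}(T_1)$.
Here $\le_0$ is the relation $\{(a,a), (a,b), (b,b)\}$.

If $T_1$ and $T_2$ are trees, then $g(T_1, T_2)$ denotes the tree $T$
such that

  $\mathrm{TDom}(T) = \{\epsilon\} \cup \{\, l \cdot w \mid w \in \mathrm{TDom}(T_1) \,\} \cup \{\, r \cdot w \mid w \in \mathrm{TDom}(T_2) \,\}$

  $T(l \cdot w) = T_1(w)$, if $w$ is a leaf of $T_1$

  $T(r \cdot w) = T_2(w)$, if $w$ is a leaf of $T_2$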
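
The definitions of trees, the ordering $\le$, and the constructor $g$ can be exercised on finite trees with the following Python sketch. It assumes a tree is stored as a dictionary from leaf paths (strings over `l` and `r`) to labels; infinite trees, which are the actual subject of this section, are outside the scope of this finite illustration, and all function names are hypothetical.

```python
# The order on leaf labels: a <= a, a <= b, b <= b.
LEQ0 = {("a", "a"), ("a", "b"), ("b", "b")}

def leaf(label):
    """A one-node tree: the root is a leaf carrying the given label."""
    return {"": label}

def g(t1, t2):
    """g(T1, T2): T1 becomes the left subtree and T2 the right subtree."""
    tree = {"l" + w: x for w, x in t1.items()}
    tree.update({"r" + w: x for w, x in t2.items()})
    return tree

def tdom(t):
    """TDom(T): the prefix closure of the set of leaf paths."""
    return {w[:i] for w in t for i in range(len(w) + 1)}

def leq(t1, t2):
    """T1 <= T2: equal tree domains and pointwise <=_0 on the leaves."""
    return tdom(t1) == tdom(t2) and all((t1[w], t2[w]) in LEQ0 for w in t1)

def same_shape(t1, t2):
    """exists t0. t0 <= t1 and t0 <= t2, using the all-"a" tree as witness."""
    t0 = {w: "a" for w in t1}
    return leq(t0, t1) and leq(t0, t2)

a, b = leaf("a"), leaf("b")
```

Two trees with different tree domains are incomparable here, which is the structural aspect of the subtype ordering.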

Let IT denote the set of all infinite trees. The structural subtyping
structure is the structure
$\mathrm{SIT} = \langle \mathrm{IT}, g, a, b, \le \rangle$. SIT is an
infinite-term counterpart to the structure BS from Section m

Similarly to the case of finite terms, define the relation $\sim$ of
"being of the same shape" in SIT by

  $t_1 \sim t_2 = \exists t_0.\ t_0 \le t_1 \wedge t_0 \le t_2$

Observe that $t_1 \sim t_2$ iff
$\mathrm{TDom}(t_1) = \mathrm{TDom}(t_2)$.

We next present an embedding $\iota$ of $\mathrm{MSOL}^{(1)}$ into SIT.
The image of the embedding $\iota$ consists of the infinite trees that
are in the same $\sim$-equivalence class as the tree $t_e$. We define
$t_e$ as the unique solution of the equation:

  $t_e = g(g(t_e, t_e), a)$

Trees in the $\sim$-equivalence class of $t_e$ have the tree domain
$D = \mathrm{TDom}(t_e)$ given by the regular context-free grammar

  $D \to \epsilon \mid r \mid l \mid lrD \mid llD$

whereas the leaves $L$ of $D$ are given by the context-free grammar

  $L \to r \mid lrL \mid llL$

or the regular expression $(lr \mid ll)^* r$. Let $h$ be the
homomorphism of words from $\{0,1\}^*$ to $\{l,r\}^*$ such that

  $h(0) = ll$

  $h(1) = lr$

If $w = a_1 \ldots a_n$ is a word, then $w^R$ denotes the reverse of
the word, $w^R = a_n \ldots a_1$.

We define the embedding $\iota$ to map a set $Y \subseteq \{0,1\}^*$
into the unique tree $t$ such that $t \sim t_e$ and for every
$w \in \{0,1\}^*$,

  $w \in Y \iff t(h(w^R) \cdot r) = b$    (103)

Observe that $\iota(\emptyset) = t_e$. Define the formulas
$\mathrm{TSucc}_0(t_1, t_2)$ and $\mathrm{TSucc}_1(t_1, t_2)$ as
follows. Because $t_2$ must have the shape of $t_e$, the second
argument of the outer $g$ is the constant $a$:

  $\mathrm{TSucc}_0(t_1, t_2) \iff t_2 = g(g(t_1, t_e), a)$
  $\mathrm{TSucc}_1(t_1, t_2) \iff t_2 = g(g(t_e, t_1), a)$

It is straightforward to show that $\iota$ is an injection and that
$\iota$ maps the relation $\subseteq$ into $\le$, the relation
$\mathrm{Succ}_0$ into $\mathrm{TSucc}_0$, and the relation
$\mathrm{Succ}_1$ into $\mathrm{TSucc}_1$. Moreover, the range of
$\iota$ is the set of all terms $t$ such that $\mathrm{sh}(t) = s_e$
where $s_e = \mathrm{sh}(t_e)$.
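
The embedding can be made concrete for finite sets $Y$ with a short Python sketch. It relies on an encoding chosen for this illustration only: a tree of shape $s_e$ is represented by the set of its leaves labelled $b$ (every other leaf is labelled $a$), so $t_e$ is the empty set and $\le$ becomes set inclusion. The names `iota`, `gg`, `tsucc0`, and `tsucc1` are hypothetical.

```python
import re

H = {"0": "ll", "1": "lr"}                 # the homomorphism h on letters
LEAF = re.compile(r"(?:ll|lr)*r")          # the leaves of TDom(t_e)
T_E = frozenset()                          # t_e: no leaf is labelled b

def h(w):
    """Extend h letter by letter to words over {0, 1}."""
    return "".join(H[c] for c in w)

def iota(Y):
    """iota(Y): w is in Y iff the leaf h(w^R).r is labelled b."""
    return frozenset(h(w[::-1]) + "r" for w in Y)

def gg(t1, t2, label):
    """g(g(t1, t2), label) on b-leaf sets, for label in {"a", "b"}."""
    out = {"ll" + p for p in t1} | {"lr" + p for p in t2}
    if label == "b":
        out.add("r")
    return frozenset(out)

def leq(t1, t2):
    """t1 <= t2 for trees of shape s_e: inclusion of the b-labelled leaves."""
    return t1 <= t2

def tsucc0(t1, t2):
    return t2 == gg(t1, T_E, "a")

def tsucc1(t1, t2):
    return t2 == gg(T_E, t1, "a")

def succ(bit, Y):
    """The set { w.bit | w in Y }."""
    return {w + bit for w in Y}

Y = {"", "01", "110"}
```

Reversing the word before applying $h$ is what turns appending a symbol to $w$ into wrapping the tree in one more layer of constructors at the root.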
7.2 A Decidable Substructure

Section 7.1 shows that terms of shape $s_e$ form a substructure within
SIT that is isomorphic to $\mathrm{MSOL}^{(1)}$. In this section we
consider the following converse problem.

Consider the formulas $B\mathcal{F}$ that, instead of the quantifiers
$\exists, \forall$, contain bounded quantifiers
$\exists_e, \forall_e$ that range over the elements of the set

  $T_e = \{\, t \mid \mathrm{sh}(t) = s_e \,\}$

We show that the set of closed formulas from $B\mathcal{F}$ that are
true in SIT is decidable.

Although the quantifiers are bounded, terms in this logic can still
denote elements of shape other than $s_e$. For example, in the atomic
formula

  $g(x_1, x_2) \le g(x_3, g(g(x_4, x_5), b))$

the term $g(x_1, x_2)$ denotes a term of the shape $g^s(s_e, s_e)$.
First we show that all atomic formulas can be brought to one of the
following forms:

1. $x_0 = g(g(x_1, x_2), a)$;

2. $x_0 = g(g(x_1, x_2), b)$;

3. $x_1 = x_2$;

4. $x_1 \le x_2$.

Consider an atomic formula $t_1 = t_2$. The key idea is that if
$\mathrm{sh}(t_1) \ne \mathrm{sh}(t_2)$ then the formula $t_1 = t_2$ is
false.

If neither of the terms $t_1$ and $t_2$ is a variable then one of them
is a constant or a constructor application. If
$t_1 = g(t_{11}, t_{12})$ then either $t_1 = t_2$ is false or
$t_2 = g(t_{21}, t_{22})$ for some $t_{21}, t_{22}$. We may therefore
decompose $t_1 = t_2$ into $t_{11} = t_{21}$ and $t_{12} = t_{22}$. By
repeating this decomposition we arrive at equalities of the form
$t_1 = t_2$ where both $t_1$ and $t_2$ are constants, or at equalities
of the form $x_0 = t(x_1, \ldots, x_n)$. The equalities between the
constants can be trivially evaluated. This leaves only equalities of
the form $x_0 = t(x_1, \ldots, x_n)$. Let
$t^s(x^s_1, \ldots, x^s_n)$ be the shape term that results from
replacing $a$ and $b$ with $c^s$ and replacing $g$ with $g^s$ in $t$.
Because all variables range over $T_e$, we conclude that
$x_0 = t(x_1, \ldots, x_n)$ can be true only if

  $s_e = t^s(s_e, \ldots, s_e)$

If $t(x_1, \ldots, x_n) \in \{a, b\}$ then this condition is false. If
$t(x_1, \ldots, x_n) \equiv x_i$, we obtain a formula of the desired
form. So assume $t(x_1, \ldots, x_n) \equiv g(t_{21}, t_{22})$. Then
$\mathrm{sh}(t_{21}) = g^s(s_e, s_e)$ and
$\mathrm{sh}(t_{22}) = c^s$. Therefore,
$t_{21} = g(t_{211}, t_{212})$ where either
$\mathrm{sh}(t_{211}) = \mathrm{sh}(t_{212}) = s_e$ or $t_1 = t_2$ is
false. Similarly, either $t_{22} \in \{a, b\}$ or $t_1 = t_2$ is false.
Therefore, $t(x_1, \ldots, x_n) = g(g(t_{211}, t_{212}), a)$,
$t(x_1, \ldots, x_n) = g(g(t_{211}, t_{212}), b)$, or $t_1 = t_2$ is
false. If $t(x_1, \ldots, x_n) = g(g(t_{211}, t_{212}), a)$ then we
may replace $t_1 = t_2$ with the formula

  $\exists_e y_1, y_2.\ x_0 = g(g(y_1, y_2), a) \wedge y_1 = t_{211} \wedge y_2 = t_{212}$

and similarly in the other case. By continuing this process by
induction on the structure of the term $t(x_1, \ldots, x_n)$ we either
conclude that $t_1 = t_2$ is false, or we conclude that $t_1 = t_2$ is
equivalent to a conjunction of formulas of the desired form.

The conversion of an atomic formula of the form $t_1 \le t_2$ is
analogous to the conversion of formulas $t_1 = t_2$.



To see the decidability it now suffices to convert the formulas of the
form $x_0 = g(g(x_1, x_2), a)$ and $x_0 = g(g(x_1, x_2), b)$ into
formulas $\mathrm{TSucc}_0(t_1, t_2)$ and $\mathrm{TSucc}_1(t_1, t_2)$.
Expressibility of $x_0 = g(g(x_1, x_2), a)$ follows from the fact that
the following relationship between $X_0, X_1, X_2$ is expressible in
MSOL:

  $X_0 = \{\, w \cdot 0 \mid w \in X_1 \,\} \cup \{\, w \cdot 1 \mid w \in X_2 \,\}$

Similarly, the expressibility of $x_0 = g(g(x_1, x_2), b)$ follows from
the fact that

  $X_0 = \{\, w \cdot 0 \mid w \in X_1 \,\} \cup \{\, w \cdot 1 \mid w \in X_2 \,\} \cup \{\epsilon\}$

is expressible in MSOL. We conclude that the set of closed
$B\mathcal{F}$ formulas that are true in SIT is decidable.



7.3 Embedding Terms into Terms 

We next give an embedding of the set of all terms into $T_e$. As in
Section 7.1, let $t_e$ be the unique solution of the equation
$t_e = g(g(t_e, t_e), a)$ and let

  $t_4(x_1, x_2, x_3, x_4) = g(g(g(g(x_1, x_2), x_3), t_e), x_4)$

Define

  $t_a = t_4(t_e, t_e, a, a)$
  $t_b = t_4(t_e, t_e, a, b)$
  $t_g(x_1, x_2) = t_4(x_1, x_2, b, b)$

Then define the homomorphism $h_T$ from the set of all terms to the set
$T_e$ by

  $h_T(a) = t_a$
  $h_T(b) = t_b$
  $h_T(g(t_1, t_2)) = t_g(h_T(t_1), h_T(t_2))$

Then $h_T$ is an embedding of the set of all terms into the subset
$T_e$ of all terms. The term algebra operations $a, b, g$ map to
$t_a, t_b, t_g$ and $\le$ maps to $\le$.

Note that, if it were possible to define a predicate $P(x)$ such that

  $P(x) \iff \exists y.\ h_T(y) = x$    (104)

then we could express all statements of SIT within the $B\mathcal{F}$
subtheory, and therefore SIT would be decidable.

The fundamental problem with specifying $P(x)$ is not the use of two
bits to encode the three possible elements $\{a, b, g\}$, but the
constraint that if a term contains a subterm of the form
$t_4(t_1, t_2, a, a)$ or $t_4(t_1, t_2, a, b)$ at some even depth, then
$t_1 = t_2 = t_e$. Compared to the relationships given by the
constructor $g$, this constraint requires talking about the successor
relation at the opposite side of the paths within a tree; see
Section 7.6.



7.4 Subtyping Trees of Known Shape 

We next argue that if we allow the logic to have a copy of
bounded quantifiers $\exists_s, \forall_s$ for every constant shape $s$, we
obtain a decidable theory. To denote constant shapes in a
finite number of symbols we consider, in addition to the term
algebra symbols $g^s, c^s$, the expressions that yield solutions of
mutually recursive equations on shapes; the details of the 
representation of types are not crucial for our argument, see 

e.g. m 
Consider a closed formula in such a language. Because every variable
has an associated constant shape, we can compute the set of all shapes
occurring in the formula. This means that all variables of the formula
range over a finite known set of shapes. This allows us to define the
predicate $P$ given by (104) as a disjunction of cases, one case for
every shape. Define functions $h_{\min}$, $h_{\max}$ that take a shape
and produce a lower and an upper bound for terms of that shape:

  $h_{\min}(c^s) = t_a$
  $h_{\min}(g^s(t^s_1, t^s_2)) = t_g(h_{\min}(t^s_1), h_{\min}(t^s_2))$

  $h_{\max}(c^s) = t_b$
  $h_{\max}(g^s(t^s_1, t^s_2)) = t_g(h_{\max}(t^s_1), h_{\max}(t^s_2))$

If $s_1, \ldots, s_n$ is the list of shapes occurring in a formula, we
then define a predicate $P$ specific to that formula by

  $P(t) = \bigvee_{i=1}^{n} (h_{\min}(s_i) \le t \wedge t \le h_{\max}(s_i))$

We can therefore define $P(t)$ and use it to translate the formula
into a $B\mathcal{F}$ formula of the same truth value. Therefore,
structural subtyping with quantification bounded to constant shapes is
decidable.

For the decidability of structural subtyping of recursive types it
would be interesting to examine the decision procedure for MSOL and
determine whether there is some uniformity in it that would allow us to
handle even quantification over shapes that are determined by
variables.

7.5 Recursive Feature Trees 

We next remark that a certain notion of subtyping of recursive feature
trees is decidable. By a feature tree we mean an infinite tree built
using a constructor which takes other feature trees and an optional
node label as an argument. In this section we consider the simple case
of one binary constructor $f$ and assume only one label, denoted by 1.
Hence, an empty feature tree is a feature tree, and if $t_1$ and $t_2$
are feature trees then so are $f^{\epsilon}(t_1, t_2)$ and
$f^{1}(t_1, t_2)$. We represent an empty feature tree $\epsilon$ by an
infinite tree that has all features $\epsilon$. We compare feature
trees as follows. Let $\le$ be defined on the features
$\{\epsilon, 1\}$ as the relation
$\{(\epsilon, \epsilon), (\epsilon, 1), (1, 1)\}$. Define $\le$ on
trees as the least relation such that:

1. $\epsilon \le t$ for all terms $t$;

2. $t_1 \le t'_1$ and $t_2 \le t'_2$ implies

  $f^{r_1}(t_1, t_2) \le f^{r_2}(t'_1, t'_2)$

for all $r_1, r_2 \in \{\epsilon, 1\}$ such that $r_1 \le r_2$.

The decidability of feature trees follows from Section 7.1 because of
the isomorphism $h_F$ between the set of terms $T_e$ and the set of
feature trees. Here $h_F$ is defined by:

  $h_F(\epsilon) = t_e$
  $h_F(f^{\epsilon}(t_1, t_2)) = g(g(h_F(t_1), h_F(t_2)), a)$
  $h_F(f^{1}(t_1, t_2)) = g(g(h_F(t_1), h_F(t_2)), b)$
The feature trees as we defined them have a limited feature and node
label alphabet. This is not a fundamental problem. Muchnik's theorem
[57] gives the decidability of MSOL of trees over arbitrary decidable
structures. It is reasonable to expect that the decidability of MSOL
over decidable structures yields a generalization of the result of
Section 7.1 and therefore the decidability of feature trees with a
richer vocabulary of features.

The crucial property of our definition of feature trees is that
features can appear in any node of the tree. Hence, there are no prefix
closure requirements on trees as in Section 7.3, which is responsible
for the relatively simple reduction to MSOL.

7.6 Reversed Binary Tree with Prefix-Closed Sets 

It is instructive to compare the difficulties our approach faces in
showing the decidability of structural subtyping of recursive types
with the difficulties reported in [48]. In [48, Section 5.3] the
authors remark that the difficulty with applying tree automata is that
the set $x = f(y, z)$ is not regular. By reversing the set of paths in
a tree representing a term we have shown in Section 7.1 that the
relationship $x = f(y, z)$ becomes expressible. However, the difficulty
now becomes specifying a set of words that represents a valid term,
because there is no immediate way of stating that a set of words is
prefix-closed. If we add an operation that allows expressing
relationships at both "ends" of the words, we obtain a structure whose
MSOL is undecidable due to the following result [52, Page 183].

Theorem 95 The MSOL theory of the structure with two successor
operations $w \cdot 0$ and $w \cdot 1$ and one inverse successor
operation $0 \cdot w$ is undecidable.

The case that is of interest to us is the dual of Theorem 95 under the
word-reversing isomorphism: a structure with operations $0 \cdot w$,
$1 \cdot w$, $w \cdot 0$ has undecidable MSOL closed formulas.

Instead of expressing prefix-closure using the operations $w \cdot 0$,
$w \cdot 1$, let us consider MSOL over the structure that contains only
the operations $0 \cdot w$ and $1 \cdot w$, but where all second-order
variables range over prefix-closed sets. This logic also turns out to
be undecidable.

Let PCl be the set of prefix-closed sets. For each word $w$, there
exists the smallest PCl set containing $w$, namely the set $C(w)$ given
by:

  $C(w) = \{\, w' \mid w' \preceq w \,\}$

Every subset of $C(w)$ in PCl is of the form $C(w_1)$ for some word
$w_1$. Define $\mathrm{PSucc}_0$ and $\mathrm{PSucc}_1$ on PCl by:

  $\mathrm{PSucc}_0(X_1, X_2) \iff \exists w.\ X_1 = C(w) \wedge X_2 = C(0 \cdot w)$
  $\mathrm{PSucc}_1(X_1, X_2) \iff \exists w.\ X_1 = C(w) \wedge X_2 = C(1 \cdot w)$
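
A small Python sketch of these definitions on finite sets of binary words may help fix the notation. It assumes words are Python strings, reads $w' \preceq w$ as "$w'$ is a prefix of $w$", and uses hypothetical helper names; `psucc` decides the relation by recovering the witness $w$ as the longest word of $X_1$.

```python
def prefix_closure(w):
    """C(w): the set of all prefixes of w, including the empty word and w."""
    return frozenset(w[:i] for i in range(len(w) + 1))

def is_prefix_closed(X):
    """A set is prefix-closed if dropping the last letter stays inside it."""
    return all(w[:-1] in X for w in X if w)

def as_chain(X):
    """Return w if X = C(w) for some word w, and None otherwise."""
    if not X:
        return None
    w = max(X, key=len)
    return w if frozenset(X) == prefix_closure(w) else None

def psucc(bit, X1, X2):
    """PSucc_bit(X1, X2): exists w. X1 = C(w) and X2 = C(bit . w)."""
    w = as_chain(X1)
    return w is not None and frozenset(X2) == prefix_closure(bit + w)

C1 = prefix_closure("1")
```

Note that the new letter is added at the front of $w$, so $C(0 \cdot w)$ is not obtained from $C(w)$ by adding a single word; this is the "opposite end" effect discussed above.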



Consider a monadic theory PrefT with relations $\mathrm{PSucc}_0$ and
$\mathrm{PSucc}_1$ where second-order variables range over the subsets
of PCl. It is easy to see that PrefT corresponds to the first-order
theory of non-structural subtyping of recursive types, with the subset
relation $\subseteq$ corresponding to the subtype relation $\le$, the
empty set corresponding to the least type $\bot$,
$\mathrm{PSucc}_0(X_1, X_2)$ corresponding to $X_2 = f(X_1, \bot)$, and
$\mathrm{PSucc}_1(X_1, X_2)$ corresponding to $X_2 = f(\bot, X_1)$. The
first-order theory of non-structural subtyping was shown undecidable in
[48], so PrefT is undecidable. An interesting open problem is the
decidability of fragments of the first-order theory of structural
subtyping. This problem translates directly to the decidability of the
fragments of PrefT, a monadic theory with prefix-closed sets, or, under
the word-reversal isomorphism, the decidability of fragments of the
monadic theory of two successor symbols with suffix-closed sets.

8 Conclusion 

In this paper we presented a quantifier elimination procedure for the
first-order theory of structural subtyping of non-recursive types. Our
decidability proof clarifies the structure of the theory of structural
subtyping by introducing explicitly the notion of the shape of a term.

We presented the proof in several stages with the hope of making the
paper more accessible and self-contained. Our result on the
decidability of the $\Sigma$-term-power is more general than the
decidability of structural subtyping of non-recursive types, because we
allow even infinite decidable base structures for primitive types. We
view this decidability result as an interesting generalization of the
decidability for term algebras and the decidability of products of
decidable theories. This generalization is potentially useful in
theorem proving and program verification.

Of potential interest might be the study of axiomatizability
properties; the quantifier elimination approach is appropriate for this
purpose [31, 30]. We did not pay much attention to this because we view
the language and the mechanism for specifying the axioms as being of
secondary importance.

Our goal in describing the quantifier elimination procedure
was to argue the decidability of the theory of structural
subtyping. While it should be relatively easy to extract an
algorithm from our proofs, we did not give a formal description
of the decision procedure. One possible formulation of the
decision procedure would be a term-rewriting system such as
[11]; this formulation is also appropriate for implementation
within a theorem prover. Our approach eliminates individual
quantifiers rather than quantifier alternations. For that
purpose we extended the language with partial functions. The
use of Kleene logic for partial functions seems to preserve
most of the properties of two-valued logic and appears to agree
with the way partial functions are used in informal
mathematical practice. An alternative direction for proving
decidability of structural subtyping would be to use
Ehrenfeucht-Fraïssé games [53, Page 405]; [13] uses techniques
based on games to study both the decidability and the
computational complexity of theories.
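As an illustration of how a three-valued logic accommodates partial functions, the sketch below assumes the strong Kleene truth tables and represents "undefined" by None. The selector `first` and the term encoding are hypothetical and are not the paper's definitions.

```python
def k_not(a):
    """Negation: undefined stays undefined."""
    return None if a is None else (not a)


def k_and(a, b):
    """Conjunction: a false conjunct decides the result even if the other is undefined."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def k_or(a, b):
    """Disjunction: a true disjunct decides the result even if the other is undefined."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def first(t):
    """Partial selector: defined only on terms of the form ("f", a, b)."""
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "f":
        return t[1]
    return None  # undefined on all other terms


def k_eq(a, b):
    """Equality atom: undefined whenever either side is undefined."""
    if a is None or b is None:
        return None
    return a == b
```

With these tables, a guarded formula such as "t is built with f, and first(t) = x" evaluates to false rather than undefined when t is a leaf, which matches the informal habit of guarding the use of a partial function by a definedness condition.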

The complexity of our decision procedure for structural
subtyping of non-recursive types is non-elementary. This is a
consequence of the non-elementary complexity of the theory of
term algebras, whose elements and operations are present in
the theory of structural subtyping. Tools like MONA [25] show
that non-elementary complexity does not necessarily make the
implementation of a decision procedure uninteresting. An
interesting property of quantifier elimination is that it can
be applied partially, to eliminate an innermost quantifier
from some formula. This property makes our decision procedure
applicable as part of an interactive theorem prover or as a
subroutine of a more general decision procedure.
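The partial use of quantifier elimination can be illustrated on a much simpler theory than structural subtyping. The sketch below is not the procedure of this paper: it uses the classical elimination step for dense linear orders without endpoints, restricted to conjunctions of atoms of the form a < b, and the formula encoding and function names are hypothetical.

```python
def eliminate_exists(x, atoms):
    """Eliminate "exists x" from a conjunction of atoms (a, b), each meaning a < b.

    Valid for dense linear orders without endpoints: a suitable x exists
    exactly when every lower bound of x lies below every upper bound of x.
    """
    if (x, x) in atoms:
        raise ValueError("x < x is unsatisfiable; not handled in this sketch")
    lowers = [a for (a, b) in atoms if b == x]   # atoms a < x
    uppers = [b for (a, b) in atoms if a == x]   # atoms x < b
    rest = [(a, b) for (a, b) in atoms if x not in (a, b)]
    return rest + [(lo, up) for lo in lowers for up in uppers]


def eliminate_innermost(formula):
    """Eliminate only the innermost quantifier, leaving outer quantifiers in place.

    A formula is either a list of atoms (a quantifier-free conjunction)
    or a tuple (quantifier, variable, body).
    """
    quantifier, var, body = formula
    if isinstance(body, list):
        if quantifier != "exists":
            raise NotImplementedError("only an innermost 'exists' is handled")
        return eliminate_exists(var, body)
    return (quantifier, var, eliminate_innermost(body))


# forall y. exists x. (y < x and x < z)
formula = ("forall", "y", ("exists", "x", [("y", "x"), ("x", "z")]))
simplified = eliminate_innermost(formula)
# simplified is ("forall", "y", [("y", "z")]), that is: forall y. y < z
```

The outer universal quantifier is left untouched, which is the sense in which a single elimination step can serve as a simplification rule inside an interactive prover or a larger decision procedure.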

In this paper we have left open the decidability of
structural subtyping of recursive types, giving only a few
remarks in Section 7. In particular, we have observed in
Section 7.1 that every formula in the monadic second-order
theory of the infinite binary tree [B, Page 317] has a
corresponding formula in the first-order theory of structural
subtyping of recursive types. In that sense, the decision
problem for structural subtyping of recursive types is at
least as hard as the decision problem for the monadic
second-order logic interpreted over the infinite binary tree.
This observation is relevant for two reasons.

First, it is unlikely that a minor modification of the
quantifier elimination technique we used to show the
decidability of structural subtyping of non-recursive types can
be used to show the decidability of structural subtyping of
recursive types. Because of the embedding in Section 7.1, such
a quantifier elimination proof would have to subsume the
determinization of tree automata over infinite trees.

Second, the embedding suggests even greater difficulties in
implementing a decision procedure for the first-order theory
of structural subtyping of recursive types (provided that one
exists). While we know of at least one interesting example of
a decision procedure for weak monadic second-order logic,
namely [25], we are not aware of any implementation of a
decision procedure for the full monadic second-order logic of
the infinite tree.

The relationship of both non-structural and structural
subtyping to the monadic second-order logic of the infinite
binary tree and of tree-like structures [58] requires further
study. In that respect the work on feature trees [36, 37]
appears particularly relevant.